PREFACE

This book gives examples of the uses of elementary statistical methods 
in the design and analysis of experiments carried out in industrial plants 
and scientific laboratories. It also deals with several of the statistical 
features of the problem of establishing a systematic program through 
which the quality of industrial output can be studied and controlled. 
There is a final chapter on some of the statistical aspects of the relation- 
ship of sampling to the risks incurred by producers and buyers. 

Those parts of the chapters in large type are meant to be usable by 
themselves; and they are intended for students, experimenters, and 
production men who are short on mathematical training. The notes 
in smaller type are included for the benefit of those who wish to go a 
little beyond the literary exposition of methods. These notes consist 
of comments on methods, rather detailed derivations, and mere outlines 
or suggestions of derivations. Some of the most important topics have 
been noted in the latter fashion for it is, unfortunately, true that many 
statistical techniques which have long served industrial statisticians 
well require rather advanced mathematics for their complete deriva- 
tion. 

The manuscript of this book has for the past several years formed 
the basis of a one-semester course in industrial statistics, Economics 38, 
given at the Massachusetts Institute of Technology. Students in this 
course are not expected to have had previous training in statistics. 

The intermediate steps of many of the examples are omitted, but final 
answers are given. These partially complete examples can be used in 
assignments to students. 

Those who work on industrial problems are aware of the obstacles to 
entirely successful use of statistical methods in industry. In particular, 
the lack of complete equivalence between industrial reality and our 
mathematical models thereof and the many technical complexities of 
manufacture and research make it advisable that our results be taken 
as tentative. Only those who are thoroughly familiar with the indus- 
trial or experimental process at hand can obtain the full benefits of the 
simple statistical methods described in this book and in other works of 
this character. In numerous instances in this book, my knowledge of
the technical processes underlying the data under discussion is slight,
and consequently my conclusions may have dubious practical significance.

I am deeply indebted to two former colleagues, Mr. Harold Bellinson, 
now of the War Department, and Mr. L. C. Young, now with Westing- 
house, and to Mr. Churchill Eisenhart, of the University of Wisconsin, 
for many suggestions. Mr. Young and Margaret Z. Freeman have 
kindly carried out many of the computations. I am also indebted to 
our department secretaries, Miss Ethel Downer and Miss Eleanor 
Prescott, for typing the manuscript. Acknowledgments to those who 
have kindly permitted me to use their tables and their data are made 
elsewhere in the book. 

I shall be glad to receive criticism and suggestions from readers. 



H. A. FREEMAN 



MASSACHUSETTS INSTITUTE OF TECHNOLOGY 
CAMBRIDGE, MASSACHUSETTS 
May, 1942

Statistical procedure and experimental design are only two
different aspects of the same whole and that whole is the
logical requirements of the complete process of adding to
natural knowledge by experimentation.

R. A. FISHER 
The Design of Experiments 



CHAPTER I 
THE DIFFERENCE OF TWO MEANS 

1.1 Object of Chapter. We wish to design an experiment so that 
from two samples of data we can answer two questions: (1) are the 
averages in the two larger sources of data, from which the samples were
drawn, equal or unequal; and (2) if they are unequal, within what 
limits can the value of the inequality be established? It is the purpose 
of this chapter to discuss conditions which should be satisfied by such an 
experiment, to illustrate a method of arranging the experiment so that 
even with a small number of observations the precision of inferences 
will be high, and finally to describe the relevant techniques for analyzing 
the data. 

The design and analysis of experiments involving more than two 
averages will be considered in the following chapters. 

1.2 Examples. Experiments having the objective stated above are 
performed in many branches of science. To give a few examples: in 
medicine and biology studies have been made of the effect of a certain 
amount of thymophysin (as compared to none) on blood pressure; also 
the difference in the number of bacterial colonies per plate when counted 
in the afternoon and in the evening. From industry and agriculture we 
have the difference in the effects of indoor and outdoor storage on the 
breaking strength of wood, comparison of the ash content of coals taken 
from two mines, and the difference in the yield of a particular variety of 
wheat under two types of fertilizer treatment. 

1.3 Uses of the results. From such experiments two kinds of infor- 
mation may be wanted. First, what factor or factors are responsible 
for any observed difference in sample averages; and second, as a result 
of the experiment, what action should be taken? We shall consider 
these questions separately. 

The difference in sample averages may be accidental rather than real, 
for samples can have different averages and yet the larger sources of 
data (to be known as populations), from which these samples were 
drawn, may have the same averages. If, however, the difference in
sample averages is shown to be real, this difference may always be
attributed separately to the influence of one or more factors and/or
to the joint influence of two or more factors. If the experiment is
designed so that the two samples are unlike only with respect to one 
factor, that factor is held responsible for any real difference in averages 
that may be found. But the samples may be unlike with respect to 
several factors. For example, if it is shown that outdoor storage 
adversely affects the bending strength of wood, the factors responsible 
for the adverse effect may be sunlight and/or rainfall. It is even possible 
that the adverse effect is chiefly due to the joint action of sunlight and 
rainfall, these factors separately having slight influence. This prelimi- 
nary experiment must then be followed by a further set of similar 
experiments in each of which both samples are alike with respect to all 
but one of the suspected factors; in this way, the responsible factor or 
factors can finally be identified. 

It is possible, however, to plan the original experiment so that it alone 
will yield all this information; examples will be discussed in detail in the 
second chapter. 

Laboratory, factory, and field experiments do not by themselves 
provide sufficient information to determine economic policy. For 
example, an experiment shows that the bending strength of wood is 
impaired by outdoor storage. Users of wood may, however, be partly 
interested in another quality characteristic, such as hardness, which by 
a similar experiment can be shown to be unaffected by outdoor storage. 
Users will, therefore, be willing to pay only a fraction of the premium 
arising from the additional cost of indoor storage and the determination 
of that fraction clearly depends on facts not supplied by either experi- 
ment. If between two methods of manufacture or two types of product 
no real (statistical) difference is found, users will have no definite 
preference and producers will favor the method or product involving the 
lesser cost. When a real difference is found, and if that difference is 
practically significant, the resultant shift in the preference of users, as 
well as cost differences, will determine the effect on the market of the 
results of the experiment. 

1.4 Problems facing the experimenter. We shall now consider a 
specific example, but the reader should be able to apply the discussion 
to any experiment involving a difference of two averages. 

An experimenter wishes to determine whether or not the average 
amounts of corrosion of two types of wrought ferrous pipe coating are 
the same. The two types are open-hearth iron and puddled iron. He 
selects several specimens of each coating, buries them in the soil and, 
on later removing them, measures the corrosion of each specimen. If
the amount of corrosion of open-hearth coating is designated by
the variable $X$ and that of puddled-iron coating by the variable $Y$,
his data on $p$ specimens of the former and $q$ specimens of the latter
are as follows:

$$X_1, X_2, \ldots, X_p \qquad\qquad Y_1, Y_2, \ldots, Y_q$$



In carrying out this experiment, he will have had to make several 
important decisions, and on them depends much of the reliability of his 
later inferences. First, should factors that can be held constant be 
allowed to vary? For example, should all specimens of pipe coating be 
of the same size, be buried in the same type of soil, at the same depth, 
covered with the same backfill, and be left in the soil the same period 
of time? Second, what account can be taken of uncontrollable factors, 
such as the weather after the burial of the specimens? Third, how are 
test specimens to be selected from the larger sources of supply, how 
many should be taken, and should there be a like or unlike number of 
each type of coating? Fourth, what index of corrosion shall be used? 
This is only a partial list but it covers the types of questions that must be 
answered in any experiment of this kind. 

1.5 Desirability of control. Let us first compute the arithmetic
means $\bar{X}$ and $\bar{Y}$ of the two samples. These averages are given by

$$\bar{X} = \frac{X_1 + X_2 + \cdots + X_p}{p}$$

$$\bar{Y} = \frac{Y_1 + Y_2 + \cdots + Y_q}{q}$$

and they can be regarded as estimates of the respective population
means $\bar{X}'$ and $\bar{Y}'$. We will consider the magnitude of
$\bar{X} - \bar{Y}$ in the light of the tentative hypothesis that the
population means $\bar{X}'$ and $\bar{Y}'$ are equal. If $\bar{X} - \bar{Y}$
is not zero and if we can show that its departure from zero was not
accidental, the hypothesis $\bar{X}' - \bar{Y}' = 0$ will be rejected.

Confidence that $\bar{X} - \bar{Y}$ is an accurate measure of
$\bar{X}' - \bar{Y}'$ is in part dependent on the variability among
observations in the populations from which the two samples are drawn.
Assume that a sample can be drawn so that the unknown variability of
the variates in the population is proportional to the calculated
variability of the variates in the sample. If, then, the sample variates
$X_1, X_2, \ldots, X_p$ differ slightly from each other, $\bar{X}$ will be
a relatively reliable measure of the population mean $\bar{X}'$ in the
sense that further sampling from the population would not greatly
affect $\bar{X}$. If $X_1, X_2, \ldots, X_p$ vary considerably, $\bar{X}$
is a less reliable estimate of $\bar{X}'$. This would be the case if, for
example, open-hearth coatings were buried in several types of soils
which differ in their corrosiveness or if some specimens were removed
from the soil before others. The argument is similar for $\bar{Y}$.

1.6 Pairing. In the example given, variability among
$X_1, X_2, \ldots, X_p$ and among $Y_1, Y_2, \ldots, Y_q$ can be reduced to
a minimum by using specimens all of the same size, burying them in the
same soil and for the same length of time, etc. The precision of
inference will be thereby improved, but the great disadvantage of this
type of experiment lies in its reduced generality and in the practical
difficulty of performing such an experiment at all. Complete control
over all relevant factors (pipe size, kind of soil, and burial period)
is practically impossible to achieve in ordinary experimenting.

The arrangement shown in the following table attains both objectives,
practicality and precision. First, it makes possible the introduction
into the experiment of variability in such important factors as type of
soil and period of burial, thus making the experiment practically feasible
and allowing it to simulate conditions of industrial life; and second, it
excludes the influence of the variability of these factors on the precision
of inferences relating to the arithmetic means.



Kind of soil,            Corrosion
length of burial         Open-hearth       Puddled-iron      Difference
                         iron coatings     coatings

Clay, A years            $X_1$             $Y_1$             $d_1 = X_1 - Y_1$
Cinders, B years         $X_2$             $Y_2$             $d_2 = X_2 - Y_2$
  ...                     ...               ...               ...
Loam, C years            $X_n$             $Y_n$             $d_n = X_n - Y_n$

Mean                     $\bar{X}$         $\bar{Y}$         $\bar{d} = \bar{X} - \bar{Y}$



Each of the quantities $d_1, d_2, \ldots, d_n$ is unaffected by differences
among various soils and the various lengths of burial, for in each
pairing both kinds of pipe are treated alike with respect to these factors.
Hence the error of $\bar{d}$ tends to be small. At the same time, the
experiment manages to include the various soil types and lengths of burial
occurring in the ordinary industrial use of these coatings. Even the
uncontrollable factor, variable weather, is also introduced and its effect
likewise excluded, for the two specimens in any pairing will likely be
buried side by side. One must, however, recognize the fact that the
results of this experiment are not as reliable for a particular combination
of the influential factors (say cinders, B years' burial, and heavy rain
after burial) as an experiment in which all $2n$ observations were
devoted to that combination.

In a single experiment a very wide range in the nature of these
combinations of influential factors is disadvantageous. Thus, in muck soil,
the superiority of open-hearth coverings may be much greater than in
any other soil, that is, the value of $d_i$ will be relatively large in that
pairing. The increased freedom allowed the experimenter by the
inclusion of this kind of soil may be offset by the increased variability
of the variates $d_i$. If this unusual reaction to muck soil is already
familiar to the experimenter, then muck soil should not be included in
the present experiment, for its inclusion is uninformative and the loss in
precision is costly.

It is important to note that the estimate, from the paired results, of
the true error of the mean difference $\bar{d}$ is based on the $n$ variates
$d_i$, whereas in the case in which $\bar{d}$ was formed from unpaired
observations, the estimate of error is based on the $2n$ variates $X_i$
and $Y_i$. In each example we shall have to determine whether the
increased precision of $\bar{d}$ resulting from the reduction of variability
due to pairing is or is not offset by the loss of precision due to a
50 per cent reduction in the number of variates.

1.7 Randomization. The two objectives, precision and practicality, 
are achieved by pairing. The remaining objective is to avoid bias, and 
this can be achieved by randomization. 

It may happen that certain influential factors cannot be handled by
pairing. In such a case, the influences of these factors cannot be
eliminated, but they can be distributed so that our comparison of
$\bar{X}$ and $\bar{Y}$ is not vitiated by their presence. To illustrate
the point, assume that in the present experiment the orientation of
specimens in the soil might influence the amount of their corrosion. If,
then, all open-hearth specimens are buried in the east side of each
excavation and all puddled-iron specimens in the west side, any conclusion
that, say, open-hearth coatings are better than puddled-iron coatings is
now assailable on the ground that the east side may have been a favorable
position. This possibility can be precluded simply by assigning positions
to the specimens of each pairing in random fashion, for example, by
tossing a coin. Randomization provides a completely objective technique
of removing the possible systematic effects of uncontrolled factors,
effects which if not randomized might vitiate the comparison of the means.
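
The coin-tossing procedure just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `assign_positions`, the use of a seeded pseudo-random generator in place of a physical coin, and the choice of 15 pairings are all introduced here and are not part of the original text.

```python
import random


def assign_positions(n_pairs, seed=None):
    """Assign east/west burial positions within each pairing by a coin toss.

    Hypothetical helper: a seeded pseudo-random generator stands in for
    the physical coin mentioned in the text.
    """
    rng = random.Random(seed)
    layout = []
    for _ in range(n_pairs):
        # One toss per pairing: "heads" puts the open-hearth specimen east.
        if rng.random() < 0.5:
            layout.append({"open_hearth": "east", "puddled_iron": "west"})
        else:
            layout.append({"open_hearth": "west", "puddled_iron": "east"})
    return layout


layout = assign_positions(15, seed=1942)
for pair_number, sides in enumerate(layout, start=1):
    print(pair_number, sides["open_hearth"], sides["puddled_iron"])
```

Because each pairing is tossed separately, neither coating is systematically associated with either side of the excavation, which is the property the argument above relies on.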

Size of pipe, i.e., exposed area, somewhat affects corrosion as measured 
by depth of pits. This factor has not been included in the pairing 
arrangement for the resultant need of drawing a fixed number of each 
size of pipe from the population would interfere with the simple sampling 
technique to be discussed in the next section. This factor should
therefore be randomized. The same argument applies to any factor. If
length of burial is not included among the controls in the pairing
arrangement, random selection of burial periods will preclude the
possibility that this factor will vitiate the results, something which
might happen if longer burial periods were unfortunately associated with
one kind of coating. It is disadvantageous to randomize a factor
which could be controlled by pairing, for the effect is to increase
the variability among the $d_i$ and therefore the error against which
$\bar{d}$ is judged.

1.8 Selection of specimens. The method of selecting specimens of 
each type of coating must be one which will not vitiate the experi- 
menter's conclusions. For example, if, as a result of biased sampling, 
the open-hearth coatings used in the experiment are better on the 
average than those in their population while puddled-iron specimens 
are, on the average, poorer than those in their population, any
inference of the nature of $\bar{d}'$ $(= \bar{X}' - \bar{Y}')$ from the
observed data will be vitiated. As a second example, if as a result of
biased sampling, the open-hearth specimens in the sample are more uniform
in amounts of corrosion than the specimens of their population, an
incorrectly high precision will be placed on $\bar{X}$.

Such bias can be avoided by selection of the specimens for each sample 
in such a way that all specimens of the corresponding population have an 
equal opportunity of being drawn. Such random selection may be 
carried out in the following way: Assume there are 30,000 open-hearth 
specimens in the population and 40 are to be drawn. Assign numbers 1 
to 30,000 to the specimens in the population. From any page of a 
table of random numbers (numbers composed of randomly selected 
digits) write down in order five-place numbers (omitting all numbers 
over 30,000) until the numbers of 40 specimens have been drawn. 
Similarly for puddled-iron specimens. Among such tables is one by 
Fisher and Yates (16) in which the digits were obtained from the 15th to 
the 19th digits of a set of 20-place logarithms. The direct approach 
would be to draw at random from a well-mixed bowl of 30,000 chips 
marked from 1 to 30,000, but the labor of marking is great. 
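
The table-of-random-numbers procedure can be imitated with a short Python sketch. It is offered under assumptions: a pseudo-random generator replaces the printed table, the function name `draw_specimen_numbers` is hypothetical, and the rule of skipping a number that has already been drawn (and the number 00000) is an addition not stated in the text.

```python
import random


def draw_specimen_numbers(population_size, sample_size, seed=None):
    """Draw distinct specimen numbers by reading five-place random numbers.

    Hypothetical helper. Numbers over `population_size` are omitted, as in
    the text; omitting 00000 and repeated numbers is an assumption made here.
    """
    rng = random.Random(seed)
    chosen = []
    seen = set()
    while len(chosen) < sample_size:
        number = rng.randint(0, 99999)  # one five-place number, 00000 to 99999
        if number < 1 or number > population_size:
            continue  # outside the numbering of the population
        if number in seen:
            continue  # assumption: a specimen is not drawn twice
        seen.add(number)
        chosen.append(number)
    return chosen


selected = draw_specimen_numbers(30000, 40, seed=16)
print(selected)
```

Every specimen number from 1 to 30,000 has the same chance of appearing at each reading, so the sketch meets the equal-opportunity requirement; the same function would be called a second time for the puddled-iron population.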
The purpose of random selection is clear but random selection in
practice may be difficult. For example, in dealing with fibers it is
impossible to assign a number to each specimen in the population;
furthermore precaution will have to be taken to avoid the tendency to
draw the longer fibers. Chance may select a specimen from the center
of a rug, or the specimen of pipe whose number is drawn may be at the
bottom of a pile of thousands of specimens. These difficulties
necessitate compromises but every effort should be made to remove
subjective decision and its attendant biases from the method of selection.

1.9 Size of the experiment. The number of specimens to be used 
in an experiment is related to (a) the expected value of the mean
difference, (b) the variability of the variates in the population, and
(c) the confidence with which our conclusions are to be stated. If in two
experiments factors (a) and (b) are the same, then the greater the 
desired degree of confidence, the larger must be the size of the samples. 
If (a) and (c) are the same, the greater the variability, the larger the 
size of the samples. If (b) and (c) are the same, then the greater the 
expected value of the mean difference, the smaller the size of the samples. 

Another important influence on the size of the experiment results 
from the fact that the variability of the variates in the population, 
whether large or small, must be estimated from the samples. This 
estimate is subject to error, and this error is reduced by use of larger 
samples. 

These general considerations do not enable an experimenter to decide 
whether he will need 10 or 50 specimens. Full information on
factors (a) and (b) may be available only when the experiment is
completed. If advance estimates of the magnitudes of (a) and (b) can be
made, the proper value of the size of each sample, $n$, can be
approximated by formulae to be developed presently.

1.10 Quality characteristics. Users of an industrial product are 
often interested in more than one of its qualities. For example, both 
hardness and tensile strength may be important. Two possibilities 
are open to the experimenter: (a) he may conduct separate experiments 
for each quality characteristic or, (b) if not more than one quality 
characteristic necessitates a destructive test, he can obtain data on all 
characteristics from one experiment. In the case of hardness and
tensile strength (b) would apply, for the test for hardness is not
destructive.

For any one quality characteristic several measures may be available. 
For example, corrosion may be measured by loss of weight or by depth 
of maximum pits. The experimenter should generally choose a measure 
which varies continuously in preference to one which can assume only a
few values. An experiment on corrosion using loss of weight or depth
of pits (both of which vary continuously upwards from zero) yields 
more information than a like-sized experiment in which amount of 
corrosion is measured simply as high, medium, and low. Such a crude 
classification conceals information which a continuous measure reveals. 
We shall not discuss methods appropriate to this crude type of classifi- 
cation, although for a few industrial products it may be the only type 
available. 

1.11 An experiment in detail. Thirty specimens, fifteen of each 
type of coating, are drawn at random from their respective populations. 
One specimen of each type of coating is included in each pair; each 
pair is buried in the same soil, in similar positions, at the same depth and 
for the same period of time. The various pipe sizes, ranging from
1 inch to 1 1/2 inches, are randomized. The results follow:



Controls                             Depth of maximum pits
                                     (expressed in thousandths of an inch)

Kind of soil     Length of burial    Open-hearth      Puddled-iron     Difference
                 (years)             iron coatings    coatings

Clay                  4.5                 73               51             +22
Clay                  3.8                 43               41             + 2
Cinders               7.1                 47               43             + 4
Cinders               6.1                 53               41             +12
Peat                  2.0                 58               47             +11
Tidal marsh           4.4                 47               32             +15
Loam                  5.5                 52               24             +28
Clay                  9.2                 38               43             - 5
Clay                  8.5                 61               53             + 8
Clay                  8.0                 56               52             + 4
Loam                  5.7                 56               57             - 1
Clay                  3.2                 34               44             -10
Clay                  4.2                 55               57             - 2
Loam                  6.6                 65               40             +25
Alkali knoll          6.4                 75               68             + 7


1.12 General nature of the test of the hypothesis $\bar{d}' = 0$. The test
of the hypothesis $\bar{d}' = 0$ proceeds as follows: First, considerable
information regarding the distribution of $d_i$ in the population is assumed
to be at hand. Now assume that from this population of $d_i$ a very large
number $k$ of random samples each of $n$ specimens have been drawn and
the mean of each sample computed. It will be found that means which
depart considerably from the population mean ($\bar{d}' = 0$) occur
infrequently whereas means near $\bar{d}' = 0$ occur frequently. The
frequency $g$ and the fractional frequency or probability $g/k$ of samples
whose means depart from the population mean by a given amount are thus
experimentally determinable. We may then determine by actual count
the probability $P$ of a departure from the population mean as large or
larger than that actually observed. As a matter of fact, the distribution
of sample means is mathematically determinable, so there is no need
for the laborious experimental approach to the determination of $P$. If $P$
is large, the observed difference $|\bar{d} - \bar{d}'|$ is attributed to the
vagaries of sampling; if $P$ is small, the difference $|\bar{d} - \bar{d}'|$
is taken to be real and the hypothesis $\bar{d}' = 0$ is rejected; the two
materials under investigation are said to be significantly different in
their means.
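
The "laborious experimental approach" can be imitated by simulation. The following Python sketch is a hedged illustration only: it assumes a normal population with mean zero, it takes the standard deviation `sigma = 11.0` as a purely hypothetical value (the text has not yet supplied one), and it uses the observed mean difference of 8 from the fifteen pairings of Section 1.11. The function name `estimate_p` is introduced here.

```python
import random


def estimate_p(observed_mean, n, sigma, k=20000, seed=0):
    """Estimate P = g/k by drawing k random samples of size n.

    Hypothetical helper: the population is assumed normal with mean 0 and
    standard deviation `sigma`; g counts the samples whose mean departs from
    0 by at least as much as the observed mean does.
    """
    rng = random.Random(seed)
    g = 0
    for _ in range(k):
        sample_mean = sum(rng.gauss(0.0, sigma) for _ in range(n)) / n
        if abs(sample_mean) >= abs(observed_mean):
            g += 1
    return g / k


# sigma = 11.0 is an assumed value chosen only for illustration.
p = estimate_p(observed_mean=8.0, n=15, sigma=11.0)
print("estimated P:", p)
```

A small estimated $P$ would, following the reasoning above, lead to rejection of the hypothesis; the conclusion of course depends on the assumed value of `sigma`.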

1.13 Normality of the population. One of the facts assumed to be
known of the population is that the frequencies $f$ of values of $d$ in the
population are normally distributed; that is,

$$f = \frac{N}{\sigma\sqrt{2\pi}}\, e^{-\frac{(d - \bar{d}')^2}{2\sigma^2}} \qquad [1]$$

where $N$ is the total frequency (total number of observations) in the
population ($N$ can be assumed to be infinite) and $\sigma$ is the standard
deviation of $d$, the nature of which will be discussed presently. Just as
the equation $y = ax$ represents a straight line with slope depending on
the value of the parameter $a$, so [1] represents a normal distribution of
frequencies with exact shape and position depending on values of the
parameters $\bar{d}'$ and $\sigma$.

Three simple properties of [1] may be noted here. The squared
exponent of $e$ shows that the frequencies of $+(d - \bar{d}')$ and
$-(d - \bar{d}')$ are equal for any $d$, that is, the distribution is
symmetrical around $d = \bar{d}'$. The maximum frequency occurs when the
exponent of $e$ is zero, which is at $d = \bar{d}'$. Finally $f$ approaches
zero as $(d - \bar{d}')$ becomes large. If the probability $f/N$ is plotted
against the deviation $d - \bar{d}'$, we have the following curve, for
fixed values of the parameters $\bar{d}'$ and $\sigma$.

[Figure: normal curve of the probability $f/N$ plotted against the deviation $d - \bar{d}'$]

The technique of testing the hypothesis that the population is normal
is similar in general nature to the test of the hypothesis $\bar{d}' = 0$
and to practically all other tests that will be made in this book. From the
data of the sample we compute one or more constants whose values are
known for a perfectly normal population. Then allowance is made for
the fact that even from a perfectly normal population a random sample
having non-normal characteristics may by chance be drawn, particularly
if the size of the sample is small. If, for the particular value of $n$
used, the departures of the sample constants from the known normal
values are greater than can be so allowed by chance, the hypothesis that
the sample came from a normal population is rejected.
Two such constants* are

$$\sqrt{\beta_1} = \frac{\text{Third moment of the population about its mean}}{(\text{Second moment of the population about its mean})^{3/2}} = \frac{\sum (d - \bar{d}')^3 / N}{\left[\sum (d - \bar{d}')^2 / N\right]^{3/2}}$$

and

$$a = \frac{\text{Mean deviation of the population}}{(\text{Second moment of the population about its mean})^{1/2}} = \frac{\sum |d - \bar{d}'| / N}{\left[\sum (d - \bar{d}')^2 / N\right]^{1/2}}$$

For a normal distribution $\sqrt{\beta_1} = 0$ and $a = \sqrt{2/\pi}$. The
former is obvious because a normal distribution is symmetrical around its
mean; hence any odd moment about the mean will be zero. Assume that
from a normal population a very large number of samples, each of size $n$,
are drawn and for each sample two constants $\sqrt{b_1}$ and $a$ are
computed, where

$$\sqrt{b_1} = \frac{\text{Third moment of the sample about its mean}^{\dagger}}{(\text{Second moment of the sample about its mean})^{3/2}} = \frac{\sum (d - \bar{d})^3 / n}{\left[\sum (d - \bar{d})^2 / n\right]^{3/2}}$$

and

$$a = \frac{\text{Mean deviation of the sample}}{(\text{Second moment of the sample about its mean})^{1/2}} = \frac{\sum |d - \bar{d}| / n}{\left[\sum (d - \bar{d})^2 / n\right]^{1/2}}$$

If the resulting distribution of the frequencies of values of $\sqrt{b_1}$ is
examined, it will be found that values near zero occur most frequently.
The form of this distribution has been successfully approximated and
Table I shows values of $\sqrt{b_1}$, for given $n$, beyond which 5 per cent
and 1 per cent of all values of $\sqrt{b_1}$ of random samples from a normal
population are found. Similar information on $a$ is shown in Table III.

It will be noted that 5 per cent and 1 per cent of the frequencies may
be interpreted as 5 per cent and 1 per cent of the area under the
frequency curve. Thus, for $n = 40$ the graph of $\sqrt{b_1}$ illustrates the
situation.

* These parameters should be defined here and elsewhere in this chapter in terms
of integrals but the summations used here should cause no difficulty.

† Properly, third moment of the elements of the sample about their mean, etc.



[Figure: probability curve of $\sqrt{b_1}$ for $n = 40$, with the values $-0.587$ and $+0.587$ marked on the $\sqrt{b_1}$ axis]



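Similar arguments hold for $a$, the distribution of which is not
symmetrical about $\sqrt{2/\pi}$ ($= 0.798$). Thus for $n = 41$, the graph
of $a$ illustrates the situation.

[Figure: probability curve of $a$ for $n = 41$, with the values 0.7470, 0.798, and 0.8540 marked on the $a$ axis]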



In the present example on the corrosion of pipe coatings we find

$$\sqrt{b_1} = 0.330$$
$$a = 0.814$$

If fewer than say 1 per cent or 2 per cent of random samples of size
$n = 15$ yield values departing by as much as or more than 0.330 and
0.016 from the expected values 0 and 0.798, the sample at hand cannot
be considered to have been drawn from a normal population. From
Table I, 1 per cent of all random samples have values of $\sqrt{b_1}$
exceeding 1.061 (for sample size 25, the first entry in the table). Now,
the spread of the distribution of $\sqrt{b_1}$ is less for large than for
small $n$. Hence, more than 1 per cent of all random samples of size 15
have $\sqrt{b_1} > 0.330$, and the hypothesis of normality is not refuted.

From Table III, the 1 per cent levels of $a$ are approximately 0.92
and 0.68. Our value $a = 0.814$ is within this range. Hence the
hypothesis of normality is not refuted by this second (and independent)
test. For small samples these tests are sensitive only to large departures
from normality. The diagram of the sample data appears somewhat
non-normal in skewness but the $\sqrt{b_1}$ test, which is a test of
skewness, did not offer support.

[Figure: diagram of the sample data]
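
The two sample constants quoted above can be recomputed directly from the fifteen differences of Section 1.11. The Python sketch below is a plain arithmetic check and not the author's own computation scheme; the variable names are introduced here, and the moments are taken with divisor $n$, which is the reading that reproduces the quoted figures.

```python
from math import sqrt

# Differences (open-hearth minus puddled-iron) from the table of Section 1.11.
differences = [22, 2, 4, 12, 11, 15, 28, -5, 8, 4, -1, -10, -2, 25, 7]
n = len(differences)
d_bar = sum(differences) / n

# Second and third moments of the sample about its mean (divisor n).
m2 = sum((d - d_bar) ** 2 for d in differences) / n
m3 = sum((d - d_bar) ** 3 for d in differences) / n

# Mean deviation of the sample about its mean.
mean_deviation = sum(abs(d - d_bar) for d in differences) / n

root_b1 = m3 / m2 ** 1.5        # measure of skewness
a = mean_deviation / sqrt(m2)   # ratio of mean deviation to standard deviation

print(f"sqrt(b1) = {root_b1:.3f}")  # prints sqrt(b1) = 0.330
print(f"a        = {a:.3f}")        # prints a        = 0.814
```

The departure of $a$ from $\sqrt{2/\pi} = 0.798$ is thus about 0.016, the figure used in the comparison with Table III.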

1.14 Variance of the population. It has been noted that the
reliability of $\bar{d}$ depends in part on the variability in corrosion of
the specimens in the population, so a knowledge of the amount of this
variability is necessary. One measure which would seem reasonable is

$$\frac{\sum (d - \bar{d}')}{N}$$

but this is not useful, for its value is always zero.

$$\sum (d - \bar{d}') = \sum d - N\bar{d}' = N\bar{d}' - N\bar{d}' = 0$$

A second possibility is the average of the sum of the absolute values
of deviations of observations about their mean, i.e., the mean deviation

$$\frac{\sum |d - \bar{d}'|}{N}$$

This is algebraically an inconvenient measure and it does not fit well
into the general body of statistical theory. The best measure of
variability is the standard deviation $\sigma$, which has already been
introduced as a parameter of the normal distribution.

$$\sigma = \sqrt{\frac{\sum (d - \bar{d}')^2}{N}}$$

We shall work with the variance $\sigma^2$. The true value of the population
variance $\sigma^2$ is unknown, but it can be estimated from the sample data.
If the sample at hand is large, a good estimate ($\hat{\sigma}^2$) of
$\sigma^2$ is the sample variance $s^2$, which is given by

$$s^2 = \frac{\sum (d - \bar{d})^2}{n} \qquad [2]$$

If the sample is small, we shall later show in part that the appropriate
estimate $\hat{\sigma}^2$ is given by

$$\hat{\sigma}^2 = \frac{\sum (d - \bar{d})^2}{n - 1} \qquad [3]$$

the divisor representing the number of independent values of $d$ (degrees
of freedom). Thus from $n$ values of $d$ one constant $\bar{d}$ has been
calculated; hence, given this value of $\bar{d}$, only $n - 1$ values of $d$
are unfixed, or independent. [3] is always better than [2] but for samples
with $n > 30$ the difference may be neglected. We write

$$\hat{\sigma}^2 = \frac{\sum (d - \bar{d})^2}{n - 1} = \frac{\sum d^2 - 2\bar{d}\sum d + n\bar{d}^2}{n - 1}$$

the last two terms of the above can be combined and we have

$$\hat{\sigma}^2 = \frac{\sum d^2 - \dfrac{\left(\sum d\right)^2}{n}}{n - 1}$$

where $n$ is the number of observations. The last form is the most
convenient for the purposes of calculation.
In the present example

$$\hat{\sigma}^2 = 121.571$$
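
The value just quoted can be verified from the fifteen differences of Section 1.11. The following Python sketch, with variable names introduced here for illustration, computes the sum of squared deviations both from the definition and from the more convenient form, and then applies the divisors $n$ and $n - 1$.

```python
# Differences (open-hearth minus puddled-iron) from the table of Section 1.11.
differences = [22, 2, 4, 12, 11, 15, 28, -5, 8, 4, -1, -10, -2, 25, 7]
n = len(differences)

sum_d = sum(differences)                    # sum of d
sum_d2 = sum(d * d for d in differences)    # sum of d squared
d_bar = sum_d / n

# Sum of squared deviations, by definition and by the convenient form.
ss_definition = sum((d - d_bar) ** 2 for d in differences)
ss_shortcut = sum_d2 - sum_d ** 2 / n

s2 = ss_shortcut / n                  # divisor n
sigma2_hat = ss_shortcut / (n - 1)    # divisor n - 1 (degrees of freedom)

print("sum d =", sum_d, " sum d^2 =", sum_d2)
print("sum of squares:", ss_definition, ss_shortcut)
print(f"s^2 = {s2:.3f}")
print(f"estimate with n - 1 divisor = {sigma2_hat:.3f}")  # prints 121.571
```

The two routes to the sum of squares agree, and with only fifteen observations the two divisors give noticeably different estimates.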

1.15 The u test. The appropriate tests of the difference of two
means may now be described in greater detail. Assume that we are
given a normal population of known variance $\sigma^2$ and mean $\bar{d}'$;
what is the distribution of the means of random samples each of $n$
observations?

First, consider a population defined only by $\bar{d}' = 0$. If from such a
population a very large number $k$ of samples each of size $n$ are drawn,
and the fractional frequencies or probabilities $g_i/k$ of their means
$\bar{d}_i$ are calculated, a distribution of the frequency of the various
means can be plotted. It will be apparent to the reader that (1) the
maximum frequency of this distribution occurs at $\bar{d} = \bar{d}'$ $(= 0)$,
(2) the distribution of sample means has smaller variance than the parent
population, (3) the variance of the distribution of means is smaller for
large than for small $n$; and (4) if the population is symmetrical about
$\bar{d}'$ (as is the normal distribution) the distribution of sample means
will be symmetrical about $\bar{d} = \bar{d}'$.

In support of (2) and (3) it will later be proved that if the variance
of the population is $\sigma^2$, the variance of the distribution of means is
$\sigma^2/n$. In connection with (4) it will be shown that if the population
is normal the distribution of the sample means will also be normal.

If the fractional frequency or probability of random samples with means
greater than the mean of our sample $\bar d$ (and less than $-\bar d$) is low, say
less than 5 per cent, the hypothesis that our sample was a random sample
from a normal population of mean $d' = 0$ and variance $\sigma^2$ is hardly
tenable. If this is found and if (1) our sample is random, (2) the
variance of the population is $\sigma^2$, and (3) the population is normal, it
follows that $d' \neq 0$. The means of the two materials are then
significantly different.

This test of differences of means, which will be called the u test, is thus
based on the following theorem: Given a normal population of mean $d'$
and variance $\sigma^2$, the means of random samples each of n observations
will be distributed normally with mean $d'$ and variance $\sigma^2/n$. This may
also be expressed as follows: given a normal population of mean $d'$ and
variance $\sigma^2$, the statistic

$$u = \frac{\bar d}{\sigma_{\bar d}} = \frac{\bar d \sqrt{n}}{\sigma}$$

is normally distributed with mean $d'/\sigma_{\bar d}$ (zero under the hypothesis
$d' = 0$) and variance unity.

[Figure: the normal population of variance $\sigma^2$ and, on the same axis, the
narrower normal distribution of means of variance $\sigma^2/n$; vertical axis:
probability.]
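The theorem on the variance of sample means can be illustrated by simulation. The following is a rough sketch, assuming a hypothetical normal population with mean 0 and standard deviation 11; the seed, the sample size, and the number of samples are arbitrary choices of this sketch.

```python
import random
import statistics

random.seed(1)   # fixed seed so that the sketch is repeatable
sigma = 11.0     # hypothetical population standard deviation
n = 10           # observations per sample
k = 20000        # number of samples drawn

# Mean of each of k random samples of n observations.
means = [
    statistics.fmean(random.gauss(0.0, sigma) for _ in range(n))
    for _ in range(k)
]

var_of_means = statistics.pvariance(means)
expected = sigma ** 2 / n   # the theorem's value, sigma^2 / n

print(round(var_of_means, 2), round(expected, 2))
```

With these settings the observed variance of the means should fall close to $\sigma^2/n = 12.1$, and well below the population variance of 121.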

1.16 The t test. In our case the variance of the normal population
is unknown; it must be estimated from a small sample, and the u test
must be modified. The statistic

$$t = \frac{\bar d}{\hat\sigma / \sqrt{n}}$$

will be distributed symmetrically about its mean (zero under the
hypothesis $d' = 0$), but the distribution
will be somewhat more peaked and will, in general, have a wider range
than the normal distribution, with its shape depending on the number
of independent observations (called degrees of freedom) from which
the estimate $\hat\sigma^2$ is calculated. The test of significance is called the t test,
and the peaked distribution is known as "Student's" distribution.

In our example the population of differences is normal; also $d' = 0$,
by hypothesis, and $\hat\sigma^2 = 121.571$; the variance $\hat\sigma_{\bar d}^2$ of the distribution of
means is $\hat\sigma^2/n = 8.105$ and the standard deviation (standard error) of
the distribution of means is $\sqrt{8.105} = 2.847$. The difference $\bar d - d'$
is 0.008 inch. We express this difference in terms of a measure of its
error such as $\hat\sigma_{\bar d}$; this division of one linear function $(\bar d - d')$ by another,

$$\sqrt{\frac{\sum (d - \bar d)^2}{n(n - 1)}},$$

eliminates the effect of the units of the difference and
permits the use of a single set of tables for all problems. Working in
thousandths of an inch, the difference of 0.008 inch is 8/2.847 or 2.81
standard error units.

The deviation is 2.81 standard error units, and the estimate of the
variance is based on 14 degrees of freedom. From Table V the
probability of exceeding by chance a deviation of 2.81 standard error units is
only 0.015, approximately. Thus the two kinds of pipe are significantly
different in their rates of corrosion. If the two types differ only
in one characteristic, say method of manufacture or inclusion or
exclusion of slag, this factor can be held responsible for the difference
in quality.
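The arithmetic of this example, and the kind of tail probability tabulated for t, can be sketched as follows. The summary figures (n = 15, mean difference 8, variance estimate 121.571) are those quoted above; the helper functions `t_density` and `two_tail_probability` are hypothetical names, and the probability is obtained by Simpson's rule rather than from Table V, so it is an approximation under that assumption.

```python
import math

def t_density(x, df):
    """Density of Student's t with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tail_probability(t, df, steps=2000):
    """P(|T| > |t|), integrating the density on [0, |t|] by Simpson's rule."""
    t = abs(t)
    h = t / steps  # steps is even, as Simpson's rule requires
    total = t_density(0.0, df) + t_density(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    central = 2 * (h / 3) * total  # area between -|t| and +|t|
    return 1 - central

n = 15
d_bar = 8.0            # mean difference, thousandths of an inch
sigma2_hat = 121.571   # variance estimate with divisor n - 1

se = math.sqrt(sigma2_hat / n)   # standard error of the mean difference
t = d_bar / se
p = two_tail_probability(t, n - 1)

print(round(se, 3), round(t, 2), round(p, 3))
```

The standard error and t agree with the figures 2.847 and 2.81 in the text, and the two-tailed probability comes out between 0.01 and 0.02.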

1.17 Analysis of unpaired variates. If kind of soil and length of
burial do not affect corrosion, it is disadvantageous to consider specimens
to be paired with respect to these factors, for the variates $d_i$ will be no
less variable than $X_i$ and $Y_i$, and there are only n variates $d_i$ in place of
2n variates $X_i$ and $Y_i$. If kind of soil and length of burial affect
corrosion, pairing will likely be advantageous. To determine the gain or loss
resulting from pairing, the variates, considered as unpaired, must be
analyzed.

Given a normal population with $X' - Y' = 0$ and variance $\sigma^2$,
assume that from this population two random samples are drawn, of
size $n_X$ and $n_Y$, and that the difference of their means is $\bar X - \bar Y$. If a
large number k of such dual drawings are made, the resulting k values
of $\bar X - \bar Y$ may be grouped into a frequency distribution. It should be
apparent that (1) this distribution of means will center about
$X' - Y' = 0$; (2) the most frequently occurring value of $\bar X - \bar Y$ will be
zero; (3) the distribution will be symmetrical about $X' - Y' = 0$;
(4) it will probably have smaller variance than the population, and
(5) its variance is small when $n_X$ and $n_Y$ are large. It will later be
proved that this distribution of the difference of means is normal with
variance

$$\sigma_{\bar X - \bar Y}^2 = \sigma^2 \left(\frac{1}{n_X} + \frac{1}{n_Y}\right)$$

One important difference between the analysis of paired and unpaired
variates lies in the estimate of the population variance. In the case of
paired variates, we have a single sample and the estimate $\hat\sigma^2$ is given by

$$\hat\sigma^2 = \frac{\sum (d - \bar d)^2}{n - 1} = \frac{n s^2}{n - 1}$$

With unpaired variates the estimate $\hat\sigma^2$ will be shown to be

$$\hat\sigma^2 = \frac{\sum (X - \bar X)^2 + \sum (Y - \bar Y)^2}{n_X + n_Y - 2}$$

Note that the estimate $\hat\sigma^2$ from a single sample of differences is based
on n - 1 independent differences whereas if the original variates are
not, or are considered not to be, paired, the estimate is based on
$n_X + n_Y - 2$ independent variates.

We have, for the unpaired variates

$d' = 0$
$\bar d = 8$
$n_X = n_Y = 15$
$\hat\sigma^2 = 125.029$

$$\hat\sigma_{\bar d}^2 = \hat\sigma^2 \left(\frac{1}{n_X} + \frac{1}{n_Y}\right) = 16.671$$

and the standard error of the difference of means is

$$\hat\sigma_{\bar d} = \sqrt{16.671} = 4.08$$

The deviation in standard error units is

$$\frac{8}{4.08} = 1.96$$

which for 28 degrees of freedom is not significant, for P is greater than
0.05, whereas in the analysis of the paired variates, the difference was
significant. In this example the gain in sensitivity from pairing
outweighed the loss of half of the degrees of freedom, and the testimony of
the paired variates may be accepted. This gain in sensitivity resulted
from the exclusion, by pairing, of the effects of factors which affected
both samples.



THE DIFFERENCE OF TWO MEANS 17 

1.18 Equality of the variances. In testing the significance of the
difference of the means of unpaired variates, the statistic t was
computed, where

$$t = \frac{\bar X - \bar Y}{\sqrt{\dfrac{\sum (X - \bar X)^2 + \sum (Y - \bar Y)^2}{n_X + n_Y - 2} \left(\dfrac{1}{n_X} + \dfrac{1}{n_Y}\right)}}$$

This might have been written

$$t = \frac{\bar X - \bar Y}{\sqrt{\dfrac{n_X s_X^2 + n_Y s_Y^2}{n_X + n_Y - 2} \left(\dfrac{1}{n_X} + \dfrac{1}{n_Y}\right)}}$$

where $s_X^2$ and $s_Y^2$ are the sample variances. The test of the means of
unpaired variates results in the acceptance or rejection of the
hypothesis that the two normal populations have the same mean $X' = Y'$
and the same variance $\sigma_X^2 = \sigma_Y^2$. If $\sigma_X^2 \neq \sigma_Y^2$ any inference regarding
the validity of the hypothesis $X' = Y'$ is open to question, for a large
value of t may reflect differences in variances rather than differences in
means. To test the hypothesis $\sigma_X^2 = \sigma_Y^2$ we compute from the two
samples the value of the statistic $L_1$ which for k samples of equal size
is given by

$$L_1 = \frac{\left(s_1^2 s_2^2 \cdots s_k^2\right)^{1/k}}{s_a^2}$$

where

$$s_a^2 = \frac{1}{k} \sum s_i^2$$

If the variances are identical, $s_1^2 = s_2^2 = \cdots = s_a^2$, then $L_1 = 1$.
The distribution of $L_1$ for random samples from a normal
population has been approximated and Table X shows values of $L_1$, for samples
of various sizes, beyond which 5 per cent and 1 per cent of all values of $L_1$
lie. Note that $L_1$ always lies between 1 and 0. We have

$s_X^2 = 125.09$
$s_Y^2 = 108.29$
$s_a^2 = 116.69$

from which

$L_1 = 0.997$

This is far above the 5 per cent level of $L_1$ shown in Table X (the 5
per cent level of $L_1$ is 0.8673); accordingly the variances of the two
normal populations from which these samples were drawn are not
significantly different.
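The value of $L_1$ quoted above can be reproduced with a few lines of Python. This is a sketch under an assumption: that $L_1$, for samples of equal size, is the geometric mean of the sample variances divided by their arithmetic mean. That reading is consistent with the three variances and the result 0.997 given in the text; the function name `l1_statistic` is hypothetical.

```python
import math

def l1_statistic(variances):
    """Geometric mean of the sample variances over their arithmetic mean.

    Assumes samples of equal size (an assumption of this sketch).
    """
    k = len(variances)
    geometric = math.exp(sum(math.log(v) for v in variances) / k)
    arithmetic = sum(variances) / k
    return geometric / arithmetic

l1 = l1_statistic([125.09, 108.29])   # the two sample variances of the text
print(round(l1, 3))
```

Because a geometric mean never exceeds the corresponding arithmetic mean, the ratio cannot exceed 1, and it equals 1 only when all the variances are identical.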
1.19 Other tests of significance of the difference of two means.

If the population is definitely not normal, it is necessary to use a test
not assuming normality. One such test has been given by Wald and
Wolfowitz (45). If the $L_1$ test indicates that the variances differ
significantly, a test of the hypothesis $X' = Y'$ has been proposed
by Fisher and Behrens (Sukhatme, 40, for examples, tables). Often
in experimental work, both normality and equality of variances will
be found or can be assumed, and in such instances Student's t test,
which incorporates normality and equal variances into its hypothesis,
should be used; the t test (or for large samples, the u test) under
these conditions will be more sensitive than a test which is designed to be
valid for more general conditions.

1.20 Further examples. Beckwith (2) gives the following data for 
tuft bind tests on each of two rugs. The values are unpaired; only 
one test of significance is available. 

RUG No. 1 RUG No. 2 

10.0 10.5 

10.5 9.5 

9.5 8.5 

18.5 9.0 

14.0 8.5 

14.0 12.0 

12.0 8.0 

9.5 10.5 

12.5 7.0 

10.0 10.5 

Are the population means significantly different? The test already
used may be summarized as follows: If a large number of pairs of small
samples of size $n_X$ and $n_Y$ respectively are drawn at random from a
normal population of mean $d' = X' - Y'$ and variance $\sigma^2$, the
quantity $\bar d/\hat\sigma_{\bar d}$ is distributed as Student's t with $n_X + n_Y - 2$ degrees of
freedom, where

$$\bar d = \bar X - \bar Y, \qquad \hat\sigma_{\bar d} = \hat\sigma \sqrt{\frac{1}{n_X} + \frac{1}{n_Y}}$$

and the best estimate $\hat\sigma^2$ of the unknown variance $\sigma^2$ is

$$\hat\sigma^2 = \frac{\sum (X - \bar X)^2 + \sum (Y - \bar Y)^2}{n_X + n_Y - 2}$$

We have

$\hat\sigma^2 = 5.174$
$\hat\sigma = 2.275$

$$t = \frac{\bar d}{2.275 \sqrt{\frac{1}{10} + \frac{1}{10}}} = \frac{2.65}{1.017} = 2.605$$

For t = 2.605 with 18 degrees of freedom we find from Table V that
P < 0.02. The means are significantly different.
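The figures for the rug example can be recomputed directly from the tabulated tuft bind values. The following is a minimal sketch; the variable names and the helper `sum_sq_dev` are assumptions of the sketch, and the data are the twenty values listed above.

```python
import math

rug1 = [10.0, 10.5, 9.5, 18.5, 14.0, 14.0, 12.0, 9.5, 12.5, 10.0]
rug2 = [10.5, 9.5, 8.5, 9.0, 8.5, 12.0, 8.0, 10.5, 7.0, 10.5]

def sum_sq_dev(values):
    """Sum of squared deviations about the sample mean."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)

nx, ny = len(rug1), len(rug2)
d_bar = sum(rug1) / nx - sum(rug2) / ny

# Pooled estimate of the common variance, divisor nx + ny - 2.
sigma2_hat = (sum_sq_dev(rug1) + sum_sq_dev(rug2)) / (nx + ny - 2)

# Standard error of the difference of the two means.
se = math.sqrt(sigma2_hat * (1 / nx + 1 / ny))
t = d_bar / se
df = nx + ny - 2

print(round(sigma2_hat, 3), round(t, 3), df)
```

The pooled variance, t, and the degrees of freedom agree with 5.174, 2.605, and 18 in the text.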

The following data showing the results of field tests on the corrosion 
of non-bituminous pipe coatings for underground use have been given 
by Logan and Ewing (25). 

SOIL TYPE    LEAD COATED STEEL PIPE    BARE STEEL PIPE

    A                 27.3                  41.4
    B                 18.4                  18.9
    C                 11.9                  21.7
    D                 28.7                   9.8
    E                 11.3                  16.8
    F                 14.8                   9.0
    G                 20.8                  19.3
    H                 21.6                  11.1
    I                 17.9                  32.1
    J                  7.8                   7.4
    K                 18.6                  68.3
    L                 14.7                  20.7
    M                 19.0                  34.4
    N                 65.3                  76.2

Do these two types of pipe differ significantly in their resistance to 
corrosion? 
Analysis of the unpaired variates yields 

t = -0.931 

which for 26 degrees of freedom is not significant. 

In this example there is some evidence that the data in any one row
are not independent. Type of soil is probably responsible for any
such lack of independence; for example, soil N appears to be highly
corrosive to both types of coatings whereas soil J has slight effect
regardless of covering; this "positive" correlation is, however, not in
evidence in all pairs. Analysis of the paired variates gives the
following results.

SOIL TYPE    DIFFERENCE IN PENETRATION

    A               -14.1
    B               - 0.5
    C               - 9.8
    D               +18.9
    E               - 5.5
    F               + 5.8
    G               + 1.5
    H               +10.5
    I               -14.2
    J               + 0.4
    K               -49.7
    L               - 6.0
    M               -15.4
    N               -10.9

           Mean =   - 6.357

and

t = -1.487

From Table V, for 13 degrees of freedom and t = 1.487, we have
P = 0.17, approximately. The value P = 0.17 is well above the critical
level, 0.05. Both tests indicate that one type of pipe is not more liable
to corrosion than the other.
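Both analyses of the pipe-coating data can be recomputed from the table of penetrations. This is a sketch only; the list names `lead` and `bare` and the helper `sum_sq_dev` are hypothetical, and the values are those tabulated for soils A to N.

```python
import math

lead = [27.3, 18.4, 11.9, 28.7, 11.3, 14.8, 20.8,
        21.6, 17.9, 7.8, 18.6, 14.7, 19.0, 65.3]
bare = [41.4, 18.9, 21.7, 9.8, 16.8, 9.0, 19.3,
        11.1, 32.1, 7.4, 68.3, 20.7, 34.4, 76.2]

def sum_sq_dev(values):
    """Sum of squared deviations about the sample mean."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)

n = len(lead)

# Unpaired analysis: pooled variance with 2n - 2 degrees of freedom.
pooled = (sum_sq_dev(lead) + sum_sq_dev(bare)) / (2 * n - 2)
d_bar = sum(lead) / n - sum(bare) / n
t_unpaired = d_bar / math.sqrt(pooled * (2 / n))

# Paired analysis: differences within each soil type, n - 1 degrees of freedom.
diffs = [x - y for x, y in zip(lead, bare)]
var_d = sum_sq_dev(diffs) / (n - 1)
t_paired = (sum(diffs) / n) / math.sqrt(var_d / n)

print(round(t_unpaired, 3), round(t_paired, 3))
```

The two statistics agree with -0.931 (26 degrees of freedom) and -1.487 (13 degrees of freedom); pairing raises the magnitude of t here, but not enough to reach significance.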

Fieldner and Selvig (13) give the following data on the ash content of
dry coal. Each pair of samples came from a different coal supply.

Sample A   Sample B     Sample A   Sample B     Sample A   Sample B

   8.91       9.02        13.04      13.08        12.82      12.79
  11.47      11.36        12.75      12.23         9.87       9.69
   9.81      10.63        11.52      11.65         8.85       9.22
   9.34       9.44        10.03      10.21        10.49      10.58
   9.73       9.88        10.75      10.06         9.16       9.39
  10.22      10.03         9.77      10.16        11.35      11.72
   8.35      10.26        11.90      12.11        12.29      12.43
  10.19      10.20        13.66      13.08         7.95       7.44
  11.49      11.45        12.94      13.12         9.14       9.77
  13.20      12.95        12.36      12.83         9.32      10.01
  13.73      14.42         5.89       5.75         4.16       4.08
  11.51      11.21         6.22       5.99         8.41       8.72
  10.60      10.60         5.27       5.36         5.70       6.01
  11.11      10.94         5.69       5.91         4.43       4.40
  10.39      10.05         5.47       5.33         4.69       4.52
  10.59      11.20         5.05       4.93         4.51       4.50
   9.88       9.87         5.25       5.37         3.42       3.32
  11.18      11.51        12.66      13.01         3.87       3.77
  10.58      11.27        12.12      12.56         4.25       4.06

It is clear that the pairs of values are positively correlated, coal source
being the control. The samples each weigh 3 pounds and are prepared
in identical fashion, so any differences between sample A and sample B
are expected to be negligibly slight. Do the data support this
expectation?

We form differences: 



Sample A - Sample B     Sample A - Sample B     Sample A - Sample B

       -0.11                   -0.04                   +0.03
       +0.11                   +0.52                   +0.18
       -0.82                   -0.13                   -0.37
       -0.10                   -0.18                   -0.09
       -0.15                   +0.69                   -0.23
       +0.19                   -0.39                   -0.37
       -1.91                   -0.21                   -0.14
       -0.01                   +0.58                   +0.51
       +0.04                   -0.18                   -0.63
       +0.25                   -0.47                   -0.69
       -0.69                   +0.14                   +0.08
       +0.30                   +0.23                   -0.31
        0.00                   -0.09                   -0.31
       +0.17                   -0.22                   +0.03
       +0.34                   +0.14                   +0.17
       -0.61                   +0.12                   +0.01
       +0.01                   -0.12                   +0.10
       -0.33                   -0.35                   +0.10
       -0.69                   -0.44                   +0.19



We then find 



t = -1.972 



From Table V, for t = 1.972 and 56 degrees of freedom P is slightly 
below 0.05. The sample means may be considered significantly differ- 
ent, although the margin is slight. We conclude that experimental 
technique is subject to improvement, or that the samples A and B 
differ with respect to a non-randomized factor. 

If the exceptional deviate d = -1.91 (the seventh value in the first
column) is omitted, the result is

t = -1.679

which, for 55 degrees of freedom, gives P > 0.05, and the difference in
mean ash content is judged not to be significant. Omission of an
observation or observations is an unsound practice and should be done
only when the investigator has reason to believe that the observation
in question was subject to special influences not affecting the
remaining observations.

1.21 Large samples, paired variates. The estimate from small
samples of the population variance is subject to error, and the amount
of the error depends on the size of the samples. Hence in determining
the probability P that a given value of $t = \bar d/\hat\sigma_{\bar d}$ could be exceeded in
random sampling from a normal population we must take into account
the number of independent observations (degrees of freedom) on which
the estimate of the population variance is based. Accordingly, the
probabilities of exceeding t given in Table V depend on the number of
degrees of freedom.

If the experiment is relatively large, with, say, more than 30
observations in each sample, the population variance can be assumed to be
given exactly by the sample variance, and the distribution of t passes
into the normal distribution of u (areas under which are given in Table
IV). The probability P that a given value of $u = \bar d/\sigma_{\bar d}$ could be
exceeded does not involve the concept of degrees of freedom; this is
evidenced by the absence of degrees of freedom in Table IV. The
normal approximation to t is completely valid only if the sample size n
is infinitely large; only the probabilities associated with the infinite
sample sizes shown in the last row of Table V will correspond to those
of Table IV. For example, in Table V for $n = \infty$ and t = 1.96, we find
P = 0.05, as indicated in the illustration at the left.

From Table IV for u = 1.96 we find a value of 0.475; the area of the
two tails is 1 - 2(0.475) = 0.05 as before; this is shown in the
illustration at the right.
If n > 30, the normal values given in Table IV may be safely used.

Let us reanalyze the data on the ash content of coal, now considering
the sample of $d_i$ to be large. We have

$d' = 0$
$\bar d = -0.1079$
$n = 57$

$$\sigma^2 = s^2 = \frac{\sum (d - \bar d)^2}{57} = 0.168 \quad \text{and} \quad \sigma = 0.41$$

$$\sigma_{\bar d} = \frac{0.41}{\sqrt{57}} = 0.054$$

$$u = \frac{\bar d}{\sigma_{\bar d}} = \frac{-0.1079}{0.054} = -1.99$$

From Table IV, P = 0.0466. The difference is judged barely
significant, as before.
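The normal-curve probabilities used here can be obtained without a table from the complementary error function. The following is a sketch: `two_tail_normal` is a hypothetical helper name, and the summary figures (n = 57, mean difference -0.1079, standard deviation 0.41) are those quoted for the ash data.

```python
import math

def two_tail_normal(u):
    """P(|U| > |u|) for a standard normal variate U."""
    return math.erfc(abs(u) / math.sqrt(2))

n = 57
d_bar = -0.1079
sigma = 0.41

se = sigma / math.sqrt(n)   # standard error of the mean difference
u = d_bar / se

print(round(se, 3), round(u, 2))
print(round(two_tail_normal(1.96), 3))   # two-tail area for u = 1.96
print(round(two_tail_normal(1.99), 4))   # two-tail area for u = 1.99
```

The last two lines correspond to the values 0.05 and 0.0466 read from Table IV in the text.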

1.22 Large samples, unpaired variates. If the observations in two
small samples are, or are considered to be, unpaired, the estimate $\hat\sigma^2$ of
the variance of the normal population is

$$\hat\sigma^2 = \frac{\sum (X - \bar X)^2 + \sum (Y - \bar Y)^2}{n_X + n_Y - 2} = \frac{n_X s_X^2 + n_Y s_Y^2}{n_X + n_Y - 2}$$

and the distribution of the statistic

$$t = \frac{\bar d}{\hat\sigma \sqrt{\frac{1}{n_X} + \frac{1}{n_Y}}}$$

is known but is not normal. If, however, the two samples are large,
say $n_X > 30$, $n_Y > 30$, the population variance may be assumed to be
known and to be given by the weighted mean of sample variances, i.e.,

$$\sigma^2 = \frac{n_X s_X^2 + n_Y s_Y^2}{n_X + n_Y} \qquad [4]$$

and the statistic t, which we have called u under these conditions, may
be written

$$u = \frac{\bar d}{\sigma_{\bar d}} = \frac{\bar d}{\sigma \sqrt{\frac{1}{n_X} + \frac{1}{n_Y}}}$$

where $\sigma$ is the square root of [4]; u is distributed normally.

1.23 Examples of 1.22. The British Cotton Industry Research
Association (5) records the following results of breaking load tests on
two types of yarn:

TYPE OF    MEAN BREAKING      STANDARD     NUMBER IN
 YARN      LOAD IN OUNCES     DEVIATION     SAMPLE

   X            6.83             1.23         1782
   Y            7.48             1.33         1914

Do the yarns differ significantly in their mean values? We have

$$\sigma_{\bar d}^2 = \frac{n_X s_X^2 + n_Y s_Y^2}{n_X + n_Y} \left(\frac{1}{n_X} + \frac{1}{n_Y}\right) = \frac{s_X^2}{n_Y} + \frac{s_Y^2}{n_X}$$

$$= \frac{(1.23)^2}{1914} + \frac{(1.33)^2}{1782} = 0.001783$$

$$u = \frac{\bar d}{\sigma_{\bar d}} = \frac{0.65}{0.04223} = 15.39$$

This deviation is so improbable that it cannot be located in Table IV.
Hence the yarns differ significantly in their means. The difference is,
however, only 0.65 ounce and may be of slight practical significance.
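The yarn calculation can be sketched in a few lines, which also show why each squared standard deviation ends up divided by the size of the other sample. The variable names are assumptions of this sketch; the figures are those of the table above, and the weighted mean of variances is taken with divisor $n_X + n_Y$ as in 1.22.

```python
import math

mean_x, sd_x, n_x = 6.83, 1.23, 1782
mean_y, sd_y, n_y = 7.48, 1.33, 1914

# Weighted mean of the two sample variances (divisor n_x + n_y).
pooled = (n_x * sd_x ** 2 + n_y * sd_y ** 2) / (n_x + n_y)

# Variance of the difference of means, in its two algebraically equal forms.
var_diff = pooled * (1 / n_x + 1 / n_y)
cross_form = sd_x ** 2 / n_y + sd_y ** 2 / n_x

u = abs(mean_x - mean_y) / math.sqrt(var_diff)

print(round(var_diff, 6), round(math.sqrt(var_diff), 5), round(u, 2))
```

The variance, standard error, and u agree with 0.001783, 0.04223, and 15.39 in the text.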

Van Rest (43) gives the following data and calculations on the effect
of stain (outdoor storage) on the hardness and bending strength of
wood.

                                  HARDNESS              BENDING STRENGTH
                             Stained   Unstained      Stained      Unstained

Number of tests                 40        100             40            100
Mean                           117        132          6,184          6,270
Sum of squares about mean    8,655     27,244     16,799,390     30,459,499

Are hardness and bending strength significantly affected by stain?
Previous formulae yield

$$\sigma_{\bar d}^2 = \frac{\sum (X - \bar X)^2 + \sum (Y - \bar Y)^2}{n_X + n_Y} \left(\frac{1}{n_X} + \frac{1}{n_Y}\right) = \frac{\sum (X - \bar X)^2 + \sum (Y - \bar Y)^2}{n_X n_Y}$$

and we obtain:

Hardness: $\sigma_{\bar d} = 2.996$ and $\bar d/\sigma_{\bar d} = 5.007$

Bending strength: $\sigma_{\bar d} = 108.7$ and $\bar d/\sigma_{\bar d} = 0.791$

Hardness is really affected by stain, whereas bending strength is not,
for the probabilities from Table IV are respectively P = 0.0000003
(highly significant) and P = 0.2148 (not significant).

1.24 Examples in which the hypothetical mean is not zero.
Frequently in industrial practice we may want to use as the population
mean the mean of a large number of observations of an earlier date or
a figure set by a standards-making body. The practical problem is to
determine whether or not the mean of the current sample differs
significantly from such a population mean.

The first illustration deals with small samples. Beckwith (2) gives 
the following data on the pile wool content, in ounces per three-quarters 
of a yard, of a fabric. 

26.0 
27.2 
26.5 

26.8 
27.0 

Quality specifications require a mean of 27.4. Are the data of this 
small sample compatible with the hypothesis that the mean of the 
population from which the sample was drawn is 27.4? 

The appropriate method of analysis, which has already been used,
is summarized as follows: If a quality characteristic X is normally
distributed with mean $X'$ and unknown variance $\sigma^2$, the quantity

$$\frac{\bar X - X'}{\hat\sigma_{\bar X}}$$

is distributed as Student's t with n - 1 degrees of freedom, where
$\hat\sigma_{\bar X} = \sqrt{\hat\sigma_{\bar X}^2}$, and

$$\hat\sigma_{\bar X}^2 = \frac{\hat\sigma^2}{n}$$

and the best estimate $\hat\sigma^2$ of the unknown variance $\sigma^2$ from a single
sample is

$$\hat\sigma^2 = \frac{\sum (X - \bar X)^2}{n - 1}$$

A test of normality of the population would have to be based on five
observations; it will not be attempted. We have

$$t = \frac{26.7 - 27.4}{\sqrt{0.22/5}} = -3.34$$

For n - 1, i.e., for four degrees of freedom, and for t = 3.34 we find
P = 0.015 (15 samples in 1000). As this is a very low probability, the
material at hand must be considered significantly different from the
specification in its mean.
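The one-sample calculation for the pile wool data can be sketched as follows. The names `sample` and `spec_mean` are assumptions of the sketch; the five observations and the specification mean 27.4 are those given above.

```python
import math

sample = [26.0, 27.2, 26.5, 26.8, 27.0]   # ounces per three-quarters of a yard
spec_mean = 27.4                          # mean required by the specification

n = len(sample)
x_bar = sum(sample) / n

# Best estimate of the variance from a single small sample (divisor n - 1).
sigma2_hat = sum((x - x_bar) ** 2 for x in sample) / (n - 1)

se = math.sqrt(sigma2_hat / n)   # standard error of the sample mean
t = (x_bar - spec_mean) / se

print(round(x_bar, 2), round(sigma2_hat, 2), round(t, 2))
```

The sample mean is 26.7, the variance estimate 0.22, and t about -3.34 with four degrees of freedom.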

The following example illustrates the case for large samples: Pettebone
and Young (32) record the following 1306 readings on the heat value in
Btu of a mixed gas. The data cover a period from January 1932 to
January 1937.

    BTU           MIDPOINTS    NUMBER OF DAYS

548.5-550.5         549.5              6
546.5-548.5         547.5              3
544.5-546.5         545.5              6
542.5-544.5         543.5             30
540.5-542.5         541.5             57
538.5-540.5         539.5            118
536.5-538.5         537.5            202
534.5-536.5         535.5            260
532.5-534.5         533.5            284
530.5-532.5         531.5            197
528.5-530.5         529.5            103
526.5-528.5         527.5             36
524.5-526.5         525.5              3
522.5-524.5         523.5              0
520.5-522.5         521.5              1

                                    1306



On 64 days at irregular intervals in the 5-year period, state inspection 
was conducted. The 64 observations which constituted an apparently 
random sample from the population of 1306 observations are given in the 
following table. 



BTU 

(midpoints) 

549.5 
547.5 
545.5 
543.5 
541.5 
539.5 
537.5 
535.5 
533.5 
531.5 
529.5 
527.5 
525.5 
523.5 
521.5 



NUMBER OF DAYS 

1 

1 

3 

3 

5 
10 
11 

9 

8 

6 

5 



1 





64 



So far as means are concerned, is it likely that this constitutes a 
random sample from the given population? 

The appropriate procedure is summarized as follows: If a quality
characteristic X is distributed normally with mean $X'$ and variance $\sigma^2$,
the means of random samples each of n observations will be distributed
normally with mean $X'$ and variance $\sigma^2/n$.

In the present example a test of the normality of the population is
hardly necessary, for it will presently be noted that if a sample is larger
than 50 and if the population is as much as 10 times as large as the
sample, the tendency to normality of the distribution of the means of
random samples is negligibly affected by the nature of the population.
If a test is to be applied, the statistics a and $\sqrt{b_1}$, or $b_2$ and $\sqrt{b_1}$, are
computed. When testing the normality of the parent population
from a small sample, the statistic a is better than $b_2$. In the present
example 1306 observations are available, and we shall use the more
familiar $b_2$ test. For a normal population

$$\beta_2 = 3$$

The distribution of

$$b_2 = \frac{\sum (X - \bar X)^4 / n}{\left[\sum (X - \bar X)^2 / n\right]^2}$$

is known and the 5 per cent and 1 per cent values are given in Table II.
For our data

$\sqrt{b_1} = 0.43$
$b_2 = 3.58$

which indicate that the population is not normal, for both tests yield
probabilities of less than 0.01. Normality of the means can, however,
be assumed. We have

$X' = 534.99$
$\sigma = 3.85$
$\bar X = 536.72$
$n = 64$

$$\sigma_{\bar X}^2 = \frac{\sigma^2}{n} = \frac{(3.85)^2}{64}, \qquad \sigma_{\bar X} = 0.481$$

To enter Table IV, form

$$u = \frac{\bar X - X'}{\sigma_{\bar X}} = \frac{1.73}{0.481} = +3.60$$


From Table IV we find that only 4 times in 10,000 trials would 3.60 
be exceeded if chance alone is responsible for the deviation. This is a 
very small probability; therefore the mean of the inspector's readings 
departs significantly from the population mean, and reasons for this fact 
should be sought. 
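The population mean and standard deviation, and the value of u, can be recomputed from the grouped table of 1306 readings. The following is a sketch under one assumption: the class with midpoint 523.5, which has no entry in the table, is taken to contain zero days (this makes the frequencies add to 1306). The variable names are hypothetical, and the sample mean 536.72 is taken as quoted in the text.

```python
import math

# Class midpoints from 549.5 down to 521.5 in steps of 2 Btu.
midpoints = [549.5 - 2 * i for i in range(15)]
# Number of days per class; the 523.5 class is assumed to be empty.
days = [6, 3, 6, 30, 57, 118, 202, 260, 284, 197, 103, 36, 3, 0, 1]

N = sum(days)
pop_mean = sum(f * x for f, x in zip(days, midpoints)) / N
pop_var = sum(f * (x - pop_mean) ** 2 for f, x in zip(days, midpoints)) / N
sigma = math.sqrt(pop_var)

n = 64                  # days of state inspection
sample_mean = 536.72    # mean of the inspector's readings, as quoted

u = (sample_mean - pop_mean) / (sigma / math.sqrt(n))

print(N, round(pop_mean, 2), round(sigma, 2), round(u, 2))
```

The grouped figures reproduce the mean 534.99 and standard deviation 3.85 used in the text, and u comes out at about 3.60.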

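The problem and its solution are shown in the following illustration.
Each shaded area P is 0.0002.

[Figure: a smoothed distribution of the 1306 observations and the narrower
distribution of means of random samples each of size n = 64; horizontal
axis: heat content in Btu, with the deviation $3.60\,\sigma_{\bar X}$ marked.]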



NOTES 

1.25 The $\sqrt{b_1}$ and $b_2$ tests for normality, in detail. The following example
illustrates in detail the $\sqrt{b_1}$ and $b_2$ tests for normality. In connection with the
example we shall show certain short-cut methods of calculating the mean and
the variance.

Pulsifer (33) gives the following data on the tensile strength, in actual load
pounds, of 1000 cap screws of a certain dimension.





Tensile strength   Number of     Tensile strength   Number of
   in pounds        screws          in pounds        screws

    15,500             1             17,800            41
    15,600            ..             17,900            29
    15,700             6             18,000            42
    15,800             8             18,100            49
    15,900             4             18,200            48
    16,000            11             18,300            41
    16,100             6             18,400            33
    16,200            15             18,500            48
    16,300            11             18,600            52
    16,400            18             18,700            28
    16,500             5             18,800            48
    16,600            10             18,900            27
    16,700            19             19,000            35
    16,800            23             19,100            25
    16,900            19             19,200            15
    17,000            20             19,300            15
    17,100            23             19,400             8
    17,200            36             19,500             3
    17,300            33             19,600             3
    17,400            35             19,700             1
    17,500            31             19,800             2
    17,600            33             19,900            ..
    17,700            39             20,000             1

                                     Total           1000



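Is the population of tensile strengths normally distributed?
The normal population distribution is given by

$$y = \frac{1}{\sigma \sqrt{2\pi}}\, e^{-\frac{1}{2} \left(\frac{X - X'}{\sigma}\right)^2} \qquad [5]$$

where y is the fractional frequency (frequency f divided by total frequency N)
or probability of screws with tensile strength X, $X'$ is the mean tensile
strength, and $\sigma$ is the standard deviation. For [5] the values of $\sqrt{\beta_1}$
and $\beta_2$ for $N \to \infty$ are respectively 0 and 3, where

$$\sqrt{\beta_1} = \frac{\nu_3}{\nu_2^{3/2}}$$

and

$$\beta_2 = \frac{\nu_4}{\nu_2^2}$$

$\nu_k$ being defined as the kth moment of the deviation $x = X - X'$, i.e.,

$$\nu_k = \frac{\sum x^k}{N}$$

Estimates of $X'$, $\sigma$, $\sqrt{\beta_1}$ and $\beta_2$ from the data of large samples are

$$\bar X = \frac{\sum X}{n}, \qquad s = \sqrt{\mu_2}, \qquad \sqrt{b_1} = \frac{\mu_3}{\mu_2^{3/2}}, \qquad b_2 = \frac{\mu_4}{\mu_2^2}$$

where

$$\mu_k = \frac{\sum (X - \bar X)^k}{n}$$

and n is the number of observations in the large sample.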

We regroup the data to facilitate computation, although a minor grouping 
error is thereby introduced and certain information is lost. A correction will 
later be introduced which will partially remove this error. 

TENSILE STRENGTH     NUMBER OF
CLASS MIDPOINTS       SCREWS

    15,500                1
    15,800               18
    16,100               32
    16,400               34
    16,700               52
    17,000               62
    17,300              104
    17,600              103
    17,900              112
    18,200              138
    18,500              133
    18,800              103
    19,100               75
    19,400               26
    19,700                6
    20,000                1

                       1000

To compute the moments $\mu_k$ it is convenient to substitute for X a new
variable Z:

$$X = a + cZ$$

where a is an arbitrary constant and c is the interval between class midpoints;
in our case c = 300. Summing this expression over all observations and
dividing by n we obtain

$$\bar X = a + cD$$

where $D = \sum fZ / n$, f being the frequency of X (or Z). By substituting these
relations into the equation for $\mu_k$ we obtain by simple algebraic expansion the
following equations, which are more convenient for purposes of calculation:

$$\mu_2 = \frac{c^2}{n} \left(\sum fZ^2 - nD^2\right)$$

$$\mu_3 = \frac{c^3}{n} \left(\sum fZ^3 - 3D \sum fZ^2 + 2nD^3\right)$$

$$\mu_4 = \frac{c^4}{n} \left(\sum fZ^4 - 4D \sum fZ^3 + 6D^2 \sum fZ^2 - 3nD^4\right)$$

Notice that the class interval c will disappear in calculating $\sqrt{b_1}$ and $b_2$.
The computations are carried out in the following table. The last column,
suggested by Charlier, is for check purposes, for

$$\sum f(Z + 1)^4 = \sum fZ^4 + 4 \sum fZ^3 + 6 \sum fZ^2 + 4 \sum fZ + n$$

X = tensile strength class midpoints; f = number of screws; Z = deviations
in class interval units from a = 17,900.

    X        f      Z       fZ      fZ^2      fZ^3       fZ^4    f(Z+1)^4

 15,500      1     -8       -8        64      -512      4,096      2,401
 15,800     18     -7     -126       882    -6,174     43,218     23,328
 16,100     32     -6     -192     1,152    -6,912     41,472     20,000
 16,400     34     -5     -170       850    -4,250     21,250      8,704
 16,700     52     -4     -208       832    -3,328     13,312      4,212
 17,000     62     -3     -186       558    -1,674      5,022        992
 17,300    104     -2     -208       416      -832      1,664        104
 17,600    103     -1     -103       103      -103        103          0
 17,900    112      0        0         0         0          0        112
 18,200    138     +1      138       138       138        138      2,208
 18,500    133     +2      266       532     1,064      2,128     10,773
 18,800    103     +3      309       927     2,781      8,343     26,368
 19,100     75     +4      300     1,200     4,800     19,200     46,875
 19,400     26     +5      130       650     3,250     16,250     33,696
 19,700      6     +6       36       216     1,296      7,776     14,406
 20,000      1     +7        7        49       343      2,401      4,096

 Negative terms         -1,201              -23,785
 Positive terms         +1,186              +13,672
 Totals   1000             -15     8,569   -10,113    186,373    198,275

The Charlier check indicates that the calculations have probably been made
correctly, for

198,275 = 186,373 + 4(-10,113) + 6(8569) + 4(-15) + 1000
        = 198,275

The moments are

$$\mu_2 = \frac{8569 - 1000 \left(\frac{-15}{1000}\right)^2}{1000}\, (300)^2 = 8.568775\, (300)^2$$

$$\mu_3 = \frac{-10{,}113 - 3 \left(\frac{-15}{1000}\right) (8569) + 2(1000) \left(\frac{-15}{1000}\right)^3}{1000}\, (300)^3 = -9.727402\, (300)^3$$

$$\mu_4 = \frac{186{,}373 - 4 \left(\frac{-15}{1000}\right) (-10{,}113) + 6 \left(\frac{-15}{1000}\right)^2 (8569) - 3(1000) \left(\frac{-15}{1000}\right)^4}{1000}\, (300)^4 = 185.777788\, (300)^4$$

from which

$\sqrt{b_1} = 0.39$
$b_2 = 2.53$

Are these values sufficiently close to 0 and 3? From Table I we find that
fewer than one in one hundred samples of size n = 1000 would have $\sqrt{b_1}$
further than 0.39 from 0. For $b_2 = 2.53$, this probability is again < 0.01, as
shown in Table II. Hence the present sample cannot be assumed to have
been randomly drawn from a normal population.
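The column totals, the Charlier check, and the moments can all be recomputed from the regrouped frequencies. This is a minimal sketch with hypothetical variable names; the moments are expressed in class-interval units (c = 1), since c cancels in $\sqrt{b_1}$ and $b_2$.

```python
# Regrouped frequencies for class midpoints 15,500 to 20,000 (interval 300).
freq = [1, 18, 32, 34, 52, 62, 104, 103, 112, 138, 133, 103, 75, 26, 6, 1]
z = list(range(-8, 8))   # deviations from a = 17,900 in class-interval units

n = sum(freq)
s1 = sum(f * v for f, v in zip(freq, z))
s2 = sum(f * v ** 2 for f, v in zip(freq, z))
s3 = sum(f * v ** 3 for f, v in zip(freq, z))
s4 = sum(f * v ** 4 for f, v in zip(freq, z))

# Charlier check: the direct column total must equal the expanded form.
charlier = sum(f * (v + 1) ** 4 for f, v in zip(freq, z))
expanded = s4 + 4 * s3 + 6 * s2 + 4 * s1 + n

# Moments about the mean, in units of the class interval.
d = s1 / n
m2 = s2 / n - d ** 2
m3 = s3 / n - 3 * d * s2 / n + 2 * d ** 3
m4 = s4 / n - 4 * d * s3 / n + 6 * d ** 2 * s2 / n - 3 * d ** 4

sqrt_b1 = m3 / m2 ** 1.5   # negative here; the text quotes its magnitude
b2 = m4 / m2 ** 2

print(s1, s2, s3, s4, charlier)
print(round(m2, 6), round(m3, 6), round(m4, 6))
print(round(sqrt_b1, 2), round(b2, 2))
```

The sums agree with the table totals, and the moments with 8.568775, -9.727402, and 185.777788; the skewness statistic is negative, with magnitude 0.39.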

1.26 Correction of the moments. In estimating $\sqrt{\beta_1}$ and $\beta_2$ the moments
$\mu_k$ may be corrected for errors resulting from grouping the original observations
into classes. The error arises from the fact that we wish to estimate the values
of $\sqrt{\beta_1}$ and $\beta_2$ of a continuous curve whereas our data form a discontinuous
curve. The adjustments most generally applicable are those due to Sheppard
(48). These adjustments assume that the corrected distribution has high
order contact (very gradual tapering) with the X axis at its extremities.

$$\mu_2\ (\text{corrected}) = \mu_2 - \tfrac{1}{12} c^2$$

$$\mu_3\ (\text{corrected}) = \mu_3$$

$$\mu_4\ (\text{corrected}) = \mu_4 - \tfrac{1}{2} c^2 \mu_2 + \tfrac{7}{240} c^4$$

For our data

$\mu_2$ (corrected) = 8.485442 $(300)^2$
$\mu_3$ (corrected) = -9.727402 $(300)^3$
$\mu_4$ (corrected) = 181.522567 $(300)^4$

and

$\sqrt{b_1} = 0.39$, $b_2 = 2.52$

which yield the same conclusions.

The given data and the closest approximating normal distribution are shown
in the following figure (actual fitting of a normal distribution to industrial
data as a supplement to the $\sqrt{b_1}$ and $b_2$ tests used here is seldom fruitful and
the technique is not discussed here).

[Figure: the given data and the fitted normal curve; horizontal axis: tensile
strength in $10^2$ pounds, from 155 to 200; vertical axis: fractional frequency,
from 0.02 to 0.16.]
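The Sheppard adjustments of 1.26 can be checked numerically. The following sketch starts from the uncorrected moments quoted in 1.25, expressed in class-interval units so that c = 1, and assumes the adjustments take the usual form: subtract $c^2/12$ from the second moment, leave the third unchanged, and subtract $c^2\mu_2/2$ and add $7c^4/240$ for the fourth. The variable names are hypothetical.

```python
# Uncorrected moments in class-interval units (c = 1), as quoted in 1.25.
m2, m3, m4 = 8.568775, -9.727402, 185.777788
c = 1.0

# Sheppard adjustments for grouping (assumed form, see the lead-in).
m2_corr = m2 - c ** 2 / 12
m3_corr = m3
m4_corr = m4 - 0.5 * c ** 2 * m2 + 7 * c ** 4 / 240

sqrt_b1_corr = m3_corr / m2_corr ** 1.5
b2_corr = m4_corr / m2_corr ** 2

print(round(m2_corr, 6), round(m4_corr, 6))
print(round(abs(sqrt_b1_corr), 2), round(b2_corr, 2))
```

The corrected moments agree with 8.485442 and 181.522567, and the statistics with 0.39 (in magnitude) and 2.52, so the correction barely changes the test.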



1.27 Some properties of the normal distribution. The X-variate of the
normal distribution extends from $-\infty$ to $+\infty$; tensile strength, however,
could not fall below zero. No serious error is introduced by this discrepancy.

In addition to properties already given, certain others may be noted.
Writing $x = X - X'$ we find, for a population of size N,

$$\int_{-\infty}^{\infty} \frac{N}{\sigma \sqrt{2\pi}}\, e^{-x^2 / 2\sigma^2}\, dx = N$$

that is, the area is N, the total frequency of observations. If we use the
fractional frequency, i.e., probability y instead of f = yN, we find

$$\int_{-\infty}^{\infty} \frac{1}{\sigma \sqrt{2\pi}}\, e^{-x^2 / 2\sigma^2}\, dx = 1$$

The probability of x falling between $\pm\infty$ is unity; the probability of x falling
between $x_1$ and $x_2$ is given by

$$\int_{x_1}^{x_2} \frac{1}{\sigma \sqrt{2\pi}}\, e^{-x^2 / 2\sigma^2}\, dx$$

Areas under this normal distribution (with abscissa $x/\sigma$) are given in Table IV.
In that table $x_1$ is taken at zero.
The points of inflection for the normal curve occur at $x = \pm\sigma$. We have

$$\frac{dy}{dx} = -\frac{x}{\sigma^2}\, y, \qquad \frac{d^2 y}{dx^2} = \left(\frac{x^2}{\sigma^4} - \frac{1}{\sigma^2}\right) y = 0$$

from which

$$x = \pm\sigma$$

Other properties of the normal distribution are discussed later.

1.28 Moments. While the moments $\nu_k$ about the mean are used in this
chapter in connection with normality, their definition is general. For a
discontinuous distribution of n observations, with which we are always faced
in practice, the kth moment about the mean has been defined by

$$\mu_k = \frac{\sum (X - \bar X)^k}{n}$$

For a continuous population the corresponding definition of the kth moment
$\nu_k$ about the mean is

$$\nu_k = \int_{-\infty}^{\infty} y\, x^k\, dx$$

where

$$x = X - X'$$



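1.29 That $\sqrt{\beta_1} = 0$ and $\beta_2 = 3$ for a normal distribution. A simple
proof has been given by Bowley (3). The odd moments for the normal distribution
are all zero; hence $\sqrt{\beta_1} = 0$. To prove, let $\nu_{2k+1}$ be any odd moment.
Then

$$\nu_{2k+1} = \int_{-\infty}^{\infty} x^{2k+1}\, \frac{1}{\sigma\sqrt{2\pi}}\, e^{-x^2/2\sigma^2}\, dx = \int_{-\infty}^{\infty} \varphi(x)\, dx = \int_{0}^{\infty} \varphi(x)\, dx + \int_{-\infty}^{0} \varphi(x)\, dx$$

In the last term of the above substitute $x' = -x$. The limits become $\infty$ and 0.
We obtain

$$\nu_{2k+1} = \int_{0}^{\infty} \varphi(x)\, dx - \int_{0}^{\infty} \varphi(x')\, dx' = 0$$

for $\varphi(x) \equiv -\varphi(-x)$ in the function

$$\varphi(x) = \frac{x^{2k+1}}{\sigma\sqrt{2\pi}}\, e^{-x^2/2\sigma^2}$$

This result would be obtained for any symmetrical distribution. Finally

$$\sqrt{\beta_1} = \frac{\nu_3}{\nu_2^{3/2}} = 0$$

for non-zero $\nu_2$.

To determine the value of $\beta_2$ for a normal distribution, first consider $\nu_4$.

$$\nu_4 = \int_{-\infty}^{\infty} \frac{x^4}{\sigma\sqrt{2\pi}}\, e^{-x^2/2\sigma^2}\, dx$$

The solution is of the form

$$x^3 e^{-x^2/2\sigma^2} \qquad [6]$$

for, after inclusion of the appropriate constants, the derivative of [6] yields the
fourth and second moments.

Omitting constants, we have for the derivative

$$\frac{d}{dx}\left(x^3 e^{-x^2/2\sigma^2}\right) = 3x^2 e^{-x^2/2\sigma^2} - \frac{x^4}{\sigma^2}\, e^{-x^2/2\sigma^2}$$

Including the constants and integrating from $-\infty$ to $+\infty$ (the function [6]
vanishes at both limits), we find

$$0 = 3\nu_2 - \frac{\nu_4}{\sigma^2}$$

or

$$\nu_4 = 3\sigma^2\nu_2 = 3\sigma^4$$

or

$$\beta_2 = \frac{\nu_4}{\sigma^4} = 3$$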
1.30 Short method of computing moments. Consider $\nu_2$. We have

$$\nu_2 = \frac{\sum (X - \bar{X})^2}{n}$$

Make the substitutions $X = a + cZ$ and $\bar{X} = a + c\bar{Z}$ where

$$\bar{Z} = \frac{\sum Z}{n} \qquad \left(= \frac{\sum fZ}{n} \text{ when the data are grouped}\right)$$

We obtain

$$\nu_2 = \frac{\sum (X - \bar{X})^2}{n} = \frac{\sum (a + cZ - a - c\bar{Z})^2}{n} = c^2\, \frac{\sum (Z - \bar{Z})^2}{n}$$

a form which facilitates rapid calculation. Similar forms have been given for
$\nu_3$ and $\nu_4$.
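
The coding substitution above can be checked numerically. The following is a minimal sketch, not the book's worksheet: the readings in `x`, the origin `a` and the class width `c` are hypothetical values chosen for illustration, and exact fractions are used so that the direct and coded results can be compared without rounding.

```python
from fractions import Fraction


def central_moment(values, k):
    """k-th moment about the mean: sum((X - mean)^k) / n."""
    n = len(values)
    mean = Fraction(sum(values), n)
    return sum((Fraction(v) - mean) ** k for v in values) / n


def coded_central_moment(values, a, c, k):
    """Same moment computed from coded values Z = (X - a) / c.

    Since X - mean(X) = c * (Z - mean(Z)), the k-th moment of X
    equals c^k times the k-th moment of Z.
    """
    z = [Fraction(v - a, c) for v in values]
    return c ** k * central_moment(z, k)


# Hypothetical readings (assumed for illustration, not from the text).
x = [155, 161, 164, 164, 167, 170, 170, 173, 176, 182]
a, c = 167, 3  # assumed arbitrary origin and class width

nu2_direct = central_moment(x, 2)
nu2_coded = coded_central_moment(x, a, c, 2)
nu3_direct = central_moment(x, 3)
nu3_coded = coded_central_moment(x, a, c, 3)

print(float(nu2_direct), float(nu2_coded))
print(float(nu3_direct), float(nu3_coded))
```

The coded values Z are small integers, which is what made the method attractive for hand computation; the factor $c^k$ restores the original units.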

1.31 Distributions of $\sqrt{b_1}$ and $b_2$. In order to determine whether or not
the computed values of $\sqrt{b_1}$ and $b_2$ differ significantly from the normal
population values $\sqrt{\beta_1} = 0$ and $\beta_2 = 3$, we need to be able to answer this
question: If a large number of random samples each of size n are drawn from a
population known to be normal and if the statistics $\sqrt{b_1}$ and $b_2$ are computed
for each sample, what will be the distribution curve of $\sqrt{b_1}$ and that of $b_2$?
The answer is not yet exactly known but the moments of the two distributions have
been given by R. A. Fisher (15, b); on the basis of Fisher's results, E. Pearson and
finally Geary and Pearson (20) have constructed approximate tables for various
values of n; they have done similar work on a.

For very large n, the distributions of $\sqrt{b_1}$ and $b_2$ approach normality with
standard deviations (usually called standard errors) respectively of $\sqrt{6/n}$
and $\sqrt{24/n}$, approximately. For n < 1000, the values given in Table I are
to be preferred to normal approximations.

Assume that for n = 300 we have $\sqrt{b_1} = -0.230$. If this value is judged
by reference to the areas given in Table I, we conclude that there are five
chances in 100 that $\sqrt{b_1} = -0.230$ could have been exceeded (in a negative
direction) in random sampling from a normal population. How does this
compare with the normal approximation? The standard error of $\sqrt{b_1}$ is
$\sqrt{6/300} = 0.1414$. Our deviation from the origin $\sqrt{b_1} = \sqrt{\beta_1} = 0$ is 0.230
or 1.626 standard error units; from the normal table the area to the left of
$-1.626\,\sigma_{\sqrt{b_1}}$ is 0.0520. Thus at this sample size and at this probability level
the normal approximation to the distribution of $\sqrt{b_1}$ results in a slight
overestimate of the proportion of samples with $\sqrt{b_1}$ to the left of -0.230.
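
The arithmetic of the normal approximation can be reproduced as follows. This is a sketch under the large-sample standard errors $\sqrt{6/n}$ and $\sqrt{24/n}$ quoted above; the helper `normal_cdf` is an illustrative name, and the Geary-Pearson tabled values are not reproduced here.

```python
import math


def normal_cdf(z):
    """Area under the unit normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


n = 300

# Large-sample standard errors of sqrt(b1) and b2.
se_sqrt_b1 = math.sqrt(6.0 / n)
se_b2 = math.sqrt(24.0 / n)

# sqrt(b1) = -0.230, measured from the normal-population value 0.
z_sqrt_b1 = -0.230 / se_sqrt_b1
tail_sqrt_b1 = normal_cdf(z_sqrt_b1)

# b2 = 2.59, measured from the normal-population value 3.
z_b2 = (2.59 - 3.0) / se_b2
tail_b2 = normal_cdf(z_b2)

print(round(se_sqrt_b1, 4), round(z_sqrt_b1, 3), round(tail_sqrt_b1, 4))
print(round(se_b2, 4), round(z_b2, 3), round(tail_b2, 4))
```

The first tail area slightly exceeds 0.05, and the second exceeds it by more, which is the sense in which the normal approximation overstates the proportion of large differences at this sample size.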



[Figure: two panels, the $\sqrt{b_1}$ distribution and the $b_2$ distribution, each
comparing the Geary-Pearson approximation with the normal approximation; a 5%
tail area is marked.]



Similar considerations hold for $b_2$, which is a test for flatness. For n = 300,
$b_2 = 2.59$ (less peaked than the normal curve) is significantly different from
3.00 at the 5 per cent level (i.e., there is 1 chance in 20 that a sample of 300
observations drawn at random from a normal population ($\beta_2 = 3$) would
have a value of $b_2$ of 2.59 or less). Judged by the normal approximation
$b_2 = 2.59$ is not quite significantly different from 3, for the area to the left of
2.59 will be found to be greater than 0.05.

In both instances the normal approximations lead to an overestimate of
the proportion of large differences; in the case of $b_2$ the error is likely to
be more serious, for while the Geary-Pearson approximation to the distribution
of $\sqrt{b_1}$ is quite similar to a normal curve, the $b_2$ approximation differs
appreciably.

In a normal population $\sqrt{\beta_1}$ and $\beta_2$ are independent of each other; hence
they constitute independent tests of normality and for a large sample to be
considered normal, both should be satisfied.

1.32 Outline of a derivation of the normal distribution. The importance 
of the normal distribution in sampling theory is evident. This distribution may 
originate in the following way: if a large number of independent causes, each 
producing a slight effect, affect a quality characteristic, values of the latter 
will, under certain conditions, be normally distributed. A derivation from 
Whittaker and Robinson (47) will be outlined. 

The strength of cap screws varies from one screw to another. In other 
words, each shows a deviation from the average. This deviation will be 
assumed to be the effect of a large number of small deviations, the latter caused 
by the operation of a large number of independent causes, each of which has 
but a small effect. 

Let the small deviations be

$$d_1, \ldots, d_n$$

with the total effect or total deviation being

$$d_1 + \cdots + d_n$$

or more generally

$$W_1 d_1 + \cdots + W_n d_n \qquad [7]$$

where $W_i$ are weights. What is the probability that for a given observation
(strength of cap screw) this deviation will lie between $\alpha_1$ and $\alpha_2$? The
probability that $d_r$ lies between x and x + dx is

$$\varphi_r(x)\, dx$$

The probability that $d_1$ lies between $d_1$ and $d_1 + dd_1$ is $\varphi_1(d_1)\,dd_1$; the
probability that $d_2$ lies between $d_2$ and $d_2 + dd_2$ is $\varphi_2(d_2)\,dd_2$, etc.

The probability of the concurrence of these deviations, if they are independent,
is given by

$$\varphi_1(d_1)\,\varphi_2(d_2) \cdots \varphi_n(d_n)\; dd_1\, dd_2 \cdots dd_n \qquad [8]$$

Therefore the probability that [7] lies between $\alpha_1$ and $\alpha_2$ is the integral of
[8] over all values satisfying

$$\alpha_1 < W_1 d_1 + \cdots + W_n d_n < \alpha_2$$
The integration leads to 



1 /* 

(3) = _- I 

27T t/ o 



where the semi-invariants $l_2, l_3, \ldots, l_n$ are simple functions of the moments
$\nu_2, \nu_3, \ldots, \nu_n$. If $l_2$ is finite and if most of the deviations $d_1 \ldots d_n$ are
of the same order of magnitude, the higher semi-invariants $l_3, \ldots, l_n$ will
generally be small in comparison with $l_2$. Hence

$$\varphi(x) = \frac{1}{\sqrt{2\pi l_2}}\, e^{-x^2/2l_2}$$



1.33 Normality of the distribution of means. Various investigations indicate
that the distribution of means of random samples is approximately
normal even when the samples are drawn from decidedly non-normal populations.
For example, Shewhart (36) obtains the following striking results
from drawing 1000 samples each of 4 items from rectangular and triangular
populations. Normal curves have been fitted to the distributions of means.




[Figure: two panels showing the frequency distributions of the means of 1000
samples of 4, each with a fitted normal curve and the parent population shown
as an inset. First panel: rectangular population. Second panel: right
triangular population.]



Carver's students (11) have considered a population of the following
non-normal character.

       X    Frequency
       3        2
      15        9
      29       43
     405      189
    1710       37

They found the distribution of 1000 means of random samples each of size 25
to be

       X    Frequency
     200        2
     280       54
     360      203
     440      310
     520      254
     600      130
     680       36
     760        9
     840        2
             ----
             1000

Carver concluded from this and other results that if the sample size is 50 or
more and the parent population is at least 10 times as large as the sample, the
shape of the parent population has relatively slight control over the shape
of the curve of means.

Exact and near-exact distributions of means have been found for various
specific, non-normal populations and, in nearly all cases, the approach of the
means to normality, even for low n, is evident. Thus, for a rectangular
population Rietz and separately Irwin have found that the distribution of means
rapidly approaches normality. This agrees with Shewhart's experimental
results which have already been mentioned. The distribution of the means
of samples drawn from a moderately skewed population known as Pearson's
Type III has been found separately by Irwin, Church, and C. C. Craig. The
result is another Type III distribution which rapidly approaches normality
even for n < 50.
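
An experiment of the kind attributed to Shewhart can be imitated by simulation. The sketch below is not a reproduction of his drawings: it assumes a continuous rectangular population on (-0.5, 0.5), a fixed random seed, and the moment ratio $b_2 = \nu_4/\nu_2^2$ as a rough gauge of shape (1.8 for a rectangular population, 3 for a normal one).

```python
import random
import statistics


def b2(values):
    """Moment ratio nu_4 / nu_2^2 computed about the sample mean."""
    n = len(values)
    m = sum(values) / n
    nu2 = sum((v - m) ** 2 for v in values) / n
    nu4 = sum((v - m) ** 4 for v in values) / n
    return nu4 / nu2 ** 2


rng = random.Random(36)  # arbitrary seed, fixed for repeatability

# 1000 individual items from the assumed rectangular population.
parent = [rng.uniform(-0.5, 0.5) for _ in range(1000)]

# 1000 means, each of a random sample of 4 items.
means = [sum(rng.uniform(-0.5, 0.5) for _ in range(4)) / 4
         for _ in range(1000)]

var_means = statistics.pvariance(means)

print("b2 of individual items:", round(b2(parent), 2))
print("b2 of means of 4:      ", round(b2(means), 2))
print("variance of means:     ", round(var_means, 4),
      "(sigma^2/n would be", round((1 / 12) / 4, 4), ")")
```

Even with samples of only 4 the shape measure of the means moves most of the way from the rectangular value toward 3, and the variance of the means falls near one quarter of the population variance.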

Using Craig's methods, Ness (29) found similar results for another
non-normal population, Pearson's Type X. Baker and later Craig found
distributions from still other non-normal populations; their results support
the opinions stated above. The extensive literature on sampling from
non-normal populations has been summarized by Rietz (35) and by Rider (34, b);
their articles contain references to the mathematical work cited above with
the exception of the unpublished results of Ness.

The proof of the normality of the curve of means when the samples are
drawn from a normal population and the proof that the variance of the mean
is given by $\sigma^2/n$ will now be given.

1.34 Normality of the mean and the difference of two means. We first
show that if x and y are independent and normally distributed about means
of zero with variances respectively of $\sigma_x^2$ and $\sigma_y^2$, then x + y (or x - y) is
normally distributed with zero mean and with variance $\sigma_x^2 + \sigma_y^2$.

A procedure due to Jackson (23) will be used.

Given $\varphi(x,y)$, the frequency function for the joint distribution of x and y.
To find $\psi(u)$ where u = x + y.

The frequency function of a single variable may be found by integrating
the joint frequency function over all possible values of the other variable.
Thus

$$\psi(x) = \int_{-\infty}^{\infty} \varphi(x,y)\, dy$$

Consequently if u = x + y, then

$$y = u - x$$

and therefore

$$\psi(u) = \int_{-\infty}^{\infty} \varphi(x, u - x)\, dx$$

is the frequency function of the variable u = x + y.

If x and y are normally distributed about zero, i.e., if

$$\varphi_1(x) = c_1 e^{-ax^2}, \qquad \varphi_2(y) = c_2 e^{-by^2}, \qquad a = \frac{1}{2\sigma_x^2}, \qquad b = \frac{1}{2\sigma_y^2}$$

we have, for x and y independent,

$$\varphi(x,y) = c_1 c_2\, e^{-ax^2 - by^2}$$

or

$$\psi(u) = c_1 c_2 \int_{-\infty}^{\infty} e^{-ax^2 - b(u-x)^2}\, dx$$

Write

$$ax^2 + b(u - x)^2 = (a + b)\left(x - \frac{bu}{a+b}\right)^2 + \frac{ab}{a+b}\, u^2$$

and

$$c = \frac{ab}{a+b}$$

We have

$$\psi(u) = c_1 c_2\, e^{-cu^2} \int_{-\infty}^{\infty} e^{-(a+b)\left(x - \frac{bu}{a+b}\right)^2}\, dx$$

The integration yields a constant multiplied by the entire area under the
normal probability distribution (unity).

$$\psi(u) = C e^{-cu^2} = C e^{-u^2/2(\sigma_x^2 + \sigma_y^2)}$$

which proves the theorem. The result for x - y is the same.

The theorem may easily be generalized to n variables.

If $x_1, \ldots, x_n$ are independent and form a random sample from a normal
population of variance $\sigma^2$, then $x_1 + \cdots + x_n$ is normally distributed with
variance $n\sigma^2$. Also

$$\bar{x} = \frac{x_1 + \cdots + x_n}{n}$$

will be normally distributed with variance $\sigma^2/n$.

To prove the normality of the sum and of the mean of observations, write

$$x_1 + x_2 + x_3 = (x_1 + x_2) + x_3$$

Now the already proved theorem on the normality of the sum of two variables
will apply to $(x_1 + x_2)$ and $x_3$. The extension to the sum or the mean
of n variables is obvious.

1.35 Variance of the mean and the difference of means. As for the variances
of the sum of n variables and of their mean it may be well to show that
these relationships (and those already found on variances) are independent of
theorems on normality. Let the variances of X and Y be respectively $\sigma_x^2$ and
$\sigma_y^2$, where, for large samples

$$\sigma_x^2 = \frac{\sum f_x (X - \bar{X})^2}{n_x}, \qquad \sigma_y^2 = \frac{\sum f_y (Y - \bar{Y})^2}{n_y}$$

$f_x$ and $f_y$ being the frequencies of X and Y respectively. Frequently we write
the variances in the form $\sum (X - \bar{X})^2 / n$, the frequency $f_x$ being implied
though not explicitly introduced.

For continuous populations with means $\bar{X}'$ and $\bar{Y}'$ set equal to zero, these
definitions are

$$\sigma_x^2 = \int_{-\infty}^{\infty} x^2 \varphi(x)\, dx, \qquad \sigma_y^2 = \int_{-\infty}^{\infty} y^2 \psi(y)\, dy$$

where $x = X - \bar{X}'$ and $y = Y - \bar{Y}'$.

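By definition of the variance we have for the variance of x + y where x and
y are independent

$$\sigma_{x+y}^2 = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} [(x + y) - 0]^2\, \varphi(x)\psi(y)\, dx\, dy$$

$$= \iint x^2 \varphi(x)\psi(y)\, dx\, dy + \iint y^2 \varphi(x)\psi(y)\, dx\, dy + 2 \iint xy\, \varphi(x)\psi(y)\, dx\, dy$$

But

$$\iint xy\, \varphi(x)\psi(y)\, dx\, dy = \int_{-\infty}^{\infty} x \varphi(x)\, dx \int_{-\infty}^{\infty} y \psi(y)\, dy = 0$$

Therefore

$$\sigma_{x+y}^2 = \int x^2 \varphi(x)\, dx \int \psi(y)\, dy + \int y^2 \psi(y)\, dy \int \varphi(x)\, dx = \sigma_x^2 + \sigma_y^2$$

since

$$\int_{-\infty}^{\infty} \psi(y)\, dy = \int_{-\infty}^{\infty} \varphi(x)\, dx = 1$$

Note that

$$\sigma_{x-y}^2 = \sigma_x^2 + \sigma_y^2$$

These results have already been reached in the special case of normal x
and normal y. If x and y are from a population of variance $\sigma^2$,

$$\sigma_{x+y}^2 = 2\sigma^2$$

or for n variables

$$\sigma_{x_1 + \cdots + x_n}^2 = n\sigma^2$$

For the variance of the mean of x + y

$$\sigma_{(x+y)/2}^2 = \iint \left(\frac{x + y}{2}\right)^2 \varphi(x)\psi(y)\, dx\, dy = \frac{\sigma_x^2 + \sigma_y^2}{4}$$

For the case in which x and y are from a population of variance $\sigma^2$

$$\sigma_{(x+y)/2}^2 = \frac{2\sigma^2}{4} = \frac{\sigma^2}{2}$$

For the mean of n variables from a population of variance $\sigma^2$

$$\sigma_{\bar{x}}^2 = \frac{n\sigma^2}{n^2} = \frac{\sigma^2}{n}$$

which was to be shown.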
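
Because these relations do not depend on normality, they can be verified exactly on small discrete populations by enumerating every equally likely combination. The sketch below assumes two hypothetical populations, `pop_x` and `pop_y`, sampled independently and with replacement; exact fractions avoid rounding.

```python
from fractions import Fraction
from itertools import product


def variance(values):
    """Variance about the mean with divisor n (all values equally likely)."""
    n = len(values)
    mean = Fraction(sum(values), n)
    return sum((Fraction(v) - mean) ** 2 for v in values) / n


# Hypothetical, deliberately non-normal populations.
pop_x = [1, 2, 2, 7]
pop_y = [0, 3, 9]

# Every equally likely pairing of an independent x and y.
sums = [x + y for x, y in product(pop_x, pop_y)]
diffs = [x - y for x, y in product(pop_x, pop_y)]

# Every equally likely sample of n items drawn with replacement from pop_x.
n = 3
means = [Fraction(sum(sample), n) for sample in product(pop_x, repeat=n)]

print(variance(sums), variance(pop_x) + variance(pop_y))
print(variance(diffs), variance(pop_x) + variance(pop_y))
print(variance(means), variance(pop_x) / n)
```

Each printed pair agrees exactly: the variance of the sum and of the difference is the sum of the variances, and the variance of the mean is the population variance divided by n.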

1.36 Mean estimate of $\sigma^2$. Given a population of kn observations divided
so that there will be k samples with n observations in each sample, k being very
large. We wish to form an estimate of the population variance $\sigma^2$, the
unknown true value of $\sigma^2$ being

$$\sigma^2 = \frac{\sum^{kn} (X - \bar{X}')^2}{kn}$$

Write

$$\frac{\sum (X - \bar{X}')^2}{n} = \frac{\sum (X - \bar{X} + \bar{X} - \bar{X}')^2}{n} = \frac{\sum (X - \bar{X})^2}{n} + (\bar{X} - \bar{X}')^2 + \frac{2(\bar{X} - \bar{X}') \sum (X - \bar{X})}{n}$$

The cross product term is zero, for $\bar{X} - \bar{X}'$ is a constant and $\sum (X - \bar{X})$
is zero. Writing the variance of the sample as

$$s^2 = \frac{\sum (X - \bar{X})^2}{n}$$

we have

$$\frac{\sum (X - \bar{X}')^2}{n} = s^2 + (\bar{X} - \bar{X}')^2$$

If this expression is summed over k samples and the result divided by k,
we obtain

$$\frac{\sum^{k} \sum^{n} (X - \bar{X}')^2}{nk} = \frac{\sum^{k} s^2}{k} + \frac{\sum^{k} (\bar{X} - \bar{X}')^2}{k}$$

Write $\bar{s}^2$ for the mean sample variance $\sum s^2 / k$. Now

$$\frac{\sum^{k} (\bar{X} - \bar{X}')^2}{k}$$

is seen to be the variance of the mean and is therefore equal to $\sigma^2/n$. We have

$$\sigma^2 = \bar{s}^2 + \frac{\sigma^2}{n}$$

i.e., the "mean" estimate of $\sigma^2$ is

$$\frac{n}{n - 1}\, \bar{s}^2$$

If $\sigma^2$ must be estimated from the data of a single sample, the estimate is

$$\frac{n}{n - 1}\, s^2$$

which is equivalent to

$$\frac{\sum (X - \bar{X})^2}{n - 1}$$

This estimate is thus shown to be the mean (unbiased) estimate of $\sigma^2$.
It is also "best" in the sense that its variance is a minimum, but this
we do not show. It may be noted that while $\frac{\sum (X - \bar{X})^2}{n - 1}$ is the mean
estimate of $\sigma^2$, $\sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}}$ is not the mean estimate of $\sigma$; this fact is
unimportant relative to the tests of significance used in this chapter.
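
The argument of 1.36 can be checked by complete enumeration instead of a very large number k of samples. The sketch below assumes a small hypothetical population sampled with replacement, so that every ordered sample of size n is equally likely; averaging over all of them gives the "mean" value exactly.

```python
from fractions import Fraction
from itertools import product


def mean(values):
    """Arithmetic mean as an exact fraction."""
    return Fraction(sum(values), len(values))


# Hypothetical population (assumed for illustration).
population = [2, 4, 4, 10]
n = 3

pop_mean = mean(population)
sigma2 = mean([(x - pop_mean) ** 2 for x in population])


def s2(sample):
    """Sample variance with divisor n, as defined in the text."""
    m = mean(sample)
    return mean([(x - m) ** 2 for x in sample])


samples = list(product(population, repeat=n))

mean_s2 = mean([s2(s) for s in samples])
mean_corrected = mean([s2(s) * n / (n - 1) for s in samples])

print("sigma^2              =", sigma2)
print("mean of s^2          =", mean_s2)
print("mean of n s^2/(n-1)  =", mean_corrected)
```

The mean of $s^2$ falls short of $\sigma^2$ by the factor $(n-1)/n$, and the corrected quantity averages to $\sigma^2$ exactly.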

1.37 Nature of the t test. It was shown that if we have a normal population
of mean $\bar{X}'$ and variance $\sigma^2$ the means of random samples of size n are
normally distributed with mean $\bar{X}'$ and variance $\sigma^2/n$. Thus, if we
reduce any deviation, say $\bar{X} - \bar{X}'$, to standard error units by forming

$$u = \frac{\bar{X} - \bar{X}'}{\sigma_{\bar{X}}} = \frac{\bar{X} - \bar{X}'}{\sigma/\sqrt{n}}$$

the probability of exceeding u is given by

$$\int_{u}^{\infty} \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}\, dz$$

Values of this probability integral may be found by subtracting the entries in
Table IV from 0.5.

If $\sigma^2$ is unknown, the best estimate of $\sigma^2$ is

$$\frac{\sum (X - \bar{X})^2}{n - 1}$$

but

$$t = \frac{\bar{X} - \bar{X}'}{s/\sqrt{n - 1}}$$

is not normally distributed, particularly not for small sample sizes, for when
n is small, the standard deviation s varies considerably from sample to sample.
The variability of s was discussed by earlier writers but to "Student" (39)
belongs the credit both for recognizing the practical importance of the problem
and for an approximately correct solution to the problem of the distribution of t.

To find the distribution of t, "Student" first found, by approximate methods,
the distribution of $s^2$. He began by finding the first four moments of the
distribution of $s^2$ in terms of the second moment $\sigma^2$ of the normal parent
population.

The moments $M_k$ of $s^2$ about the left end of the range ($s^2 = 0$) are found
from simple expansions which yield, for example,







- o) 



Similar expressions may be found for the third and fourth moments $M_3$
and $M_4$ in terms of $\sigma^2$. These expressions are easily transformed to moments
about the mean of $s^2$, and from these statistics the values of $\sqrt{\beta_1}$ and $\beta_2$ are
computed. The values of $\sqrt{\beta_1}$ and $\beta_2$ indicate that a Pearson Type III curve
will fit the distribution of $s^2$, from which the ordinate of the distribution of $s^2$
is found to be

$$y_{s^2} = c_1 (s^2)^{(n-3)/2}\, e^{-ns^2/2\sigma^2}$$

and the ordinate of the distribution of s,

$$y_s = c_2\, s^{n-2}\, e^{-ns^2/2\sigma^2}$$

$c_1$ and $c_2$ being constants.

"Student" then partially proved that $\bar{X}$ and $s^2$ were independent of each
other. Thus, knowing the mean to be normally distributed and the standard
deviation to be distributed as given above, he found the ordinate of the
distribution of the ratio t to be

$$y_t = \frac{\varphi(n)}{\left(1 + \dfrac{t^2}{n}\right)^{\frac{1}{2}(n+1)}}$$

a distribution which is symmetrical about t = 0, which is more peaked than the
normal curve but which approaches normality as n becomes large. $\varphi(n)$ is
known. Table V gives values of

$$\int y_t\, dt$$

R. A. Fisher (15, a) later gave an exact proof of the distribution of t and of
the complete independence of the mean and the variance of random samples
drawn from a normal population.

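1.38 Mean estimate of $\sigma^2$ from two small samples. If the quantity

$$\frac{\sum (X - \bar{X})^2 + \sum (Y - \bar{Y})^2}{n_X + n_Y - 2}$$

is summed over a large number, k, of samples and the sum divided by k, the
resulting mean or "expected" value will be found to be equal to $\sigma^2$. This is
easily demonstrated if we make use of three elementary properties of E(X),
the expected value of a variable X.

    E(X) = mean X
    E(cX) = c mean X, where c is a constant
    E(X + Y) = E(X) + E(Y)

We have

$$E\left[\frac{\sum (X - \bar{X})^2 + \sum (Y - \bar{Y})^2}{n_X + n_Y - 2}\right] = \frac{1}{n_X + n_Y - 2} \left\{ E\left[\sum (X - \bar{X})^2\right] + E\left[\sum (Y - \bar{Y})^2\right] \right\}$$

But

$$E\left[\frac{\sum (X - \bar{X})^2}{n_X - 1}\right] = \sigma^2 \quad \text{and} \quad E\left[\frac{\sum (Y - \bar{Y})^2}{n_Y - 1}\right] = \sigma^2$$

whence

$$E\left[\frac{\sum (X - \bar{X})^2 + \sum (Y - \bar{Y})^2}{n_X + n_Y - 2}\right] = \frac{1}{n_X + n_Y - 2} \left[ (n_X - 1)\sigma^2 + (n_Y - 1)\sigma^2 \right] = \sigma^2$$

The mean value of the given function is $\sigma^2$. Hence the function is an
unbiased estimate of $\sigma^2$. As in the case of estimating $\sigma^2$ from a single
sample, the mean value given above is the "optimum" or best estimate of $\sigma^2$,
in the sense that it has minimum variance.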

1.39 "Student's" t applied to the difference of means. The application
of the distribution of t to problems involving the difference of the means of
two small samples arises from the fact that t is essentially the ratio of a
normally distributed variable $\bar{X}$ to an independent estimate of the standard
error of $\bar{X}$. Write

$$u^2 = \left(\frac{\bar{X} - \bar{X}'}{\sigma/\sqrt{n}}\right)^2, \qquad t^2 = \left(\frac{\bar{X} - \bar{X}'}{s/\sqrt{n - 1}}\right)^2$$

Replacing $\sigma^2$ by $\frac{\sum (X - \bar{X})^2}{n - 1}$ and dividing numerator and denominator by
$\sigma^2$ we obtain

$$t^2 = \frac{\left(\dfrac{\bar{X} - \bar{X}'}{\sigma/\sqrt{n}}\right)^2}{\dfrac{1}{n - 1} \sum \left(\dfrac{X - \bar{X}}{\sigma}\right)^2}$$

The numerator $\frac{\bar{X} - \bar{X}'}{\sigma/\sqrt{n}}$ is normally distributed about zero with unit standard
deviation. The denominator $\frac{1}{n-1} \sum \left(\frac{X - \bar{X}}{\sigma}\right)^2$ is a function both of $\sum (X - \bar{X})^2$
and of the number of independent observations on which the estimate of $\sigma^2$
is based and its distribution is known. R. A. Fisher (15, a) was the first to
note that any statistic which could be expressed as the ratio of a normally
distributed variable to the square root of such an independently distributed
estimate of the variance of that variable would be distributed as t with degrees
of freedom equal to the number of independent observations from which the
estimate of the variance was made.

This condition is satisfied in a difference of means test. If from normal
populations of means $\bar{X}'$ and $\bar{Y}'$ and variance $\sigma^2$ we draw two random samples
of sizes $n_X$ and $n_Y$ and means $\bar{X}$ and $\bar{Y}$ (optimum estimates of $\bar{X}'$ and $\bar{Y}'$),
we already know that frequencies of $\bar{X}$ and $\bar{Y}$ are distributed normally about
$\bar{X}'$ and $\bar{Y}'$ with respective variances of $\sigma^2/n_X$ and $\sigma^2/n_Y$. We have shown
that $\bar{X} - \bar{Y}$ is normally distributed about $\bar{X}' - \bar{Y}'$ with variance

$$\sigma^2 \left(\frac{1}{n_X} + \frac{1}{n_Y}\right)$$

The unbiased (and optimum) estimate of $\sigma^2$ is

$$\frac{\sum (X - \bar{X})^2 + \sum (Y - \bar{Y})^2}{n_X + n_Y - 2}$$

which is independent of $\bar{X} - \bar{Y}$. Hence

$$t = \frac{(\bar{X} - \bar{Y}) - (\bar{X}' - \bar{Y}')}{\sqrt{\left(\dfrac{1}{n_X} + \dfrac{1}{n_Y}\right) \dfrac{\sum (X - \bar{X})^2 + \sum (Y - \bar{Y})^2}{n_X + n_Y - 2}}}$$

is distributed as t with $n_X + n_Y - 2$ degrees of freedom. In our examples
and in general we are attempting to infer whether or not $\bar{X}' = \bar{Y}'$. Thus if
the value of t computed with $\bar{X}' - \bar{Y}' = 0$ lies beyond the 5 per cent level of t
we conclude that the optimum estimates $\bar{X}$ and $\bar{Y}$ are significantly different,
i.e., $\bar{X}' \neq \bar{Y}'$.

1.40 Correlation and the t test. Given

    X_1    Y_1
     .      .
     .      .
    X_n    Y_n

We have tested the difference of means of small samples of X and Y in two
ways. With unpaired variates we computed

$$t = \frac{\bar{X} - \bar{Y}}{\sqrt{\dfrac{2}{n} \cdot \dfrac{\sum (X - \bar{X})^2 + \sum (Y - \bar{Y})^2}{2n - 2}}}$$

which is distributed as t with 2n - 2 degrees of freedom. With paired
variates, we formed

$$X_1 - Y_1 = d_1, \qquad X_2 - Y_2 = d_2, \qquad \ldots, \qquad X_n - Y_n = d_n$$

and then computed

$$\bar{X} - \bar{Y} = \bar{d}$$

and

$$t = \frac{\bar{d}}{\sqrt{\dfrac{\sum (d - \bar{d})^2}{(n - 1)\, n}}}$$

which is distributed as t with n - 1 degrees of freedom.

Certain features of the two situations are brought out in the following
example. From two normal populations we draw the two following samples:

     X     Y
     6     7
     4     5
     8     9
     3     4
     9    10

$\bar{X} = 6$, $\bar{Y} = 7$

We find

$$t = \frac{\bar{X} - \bar{Y}}{\sqrt{\left(\dfrac{1}{n_X} + \dfrac{1}{n_Y}\right) \dfrac{\sum (X - \bar{X})^2 + \sum (Y - \bar{Y})^2}{n_X + n_Y - 2}}} = \frac{-1}{\sqrt{\dfrac{2}{5} \cdot \dfrac{52}{8}}} = -0.620$$

which for eight degrees of freedom yields P = 0.55; the difference is not
significant.

If we apply the same method to the following data, X and Y having the
same variates as in the preceding case,

     X     Y
     6     7
     4     4
     8    10
     3     9
     9     5

$\bar{X} = 6$, $\bar{Y} = 7$

we obtain exactly the same result. But the two sets of data differ strikingly.
In the first set, whenever X is greater than $\bar{X}$, the Y paired with that X is
greater than $\bar{Y}$ and whenever X is less than $\bar{X}$, the paired Y is less than $\bar{Y}$.
In the second set of data, on the other hand, there appears to be little
correlation between X and Y; for example, when X is greater than $\bar{X}$, Y is in one
case greater than $\bar{Y}$ (X = 8, Y = 10) whereas in another case Y is less
than $\bar{Y}$ (X = 9, Y = 5).

Now consider the value of the correlation coefficient r where

$$r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2 \sum (Y - \bar{Y})^2}}$$

for both of the above cases. The value of the numerator of r varies from $-\infty$
to $+\infty$ with the amount and nature (negative and positive) of the correlation
between X and Y. The effect of the denominator is to reduce this variation
to the range -1 to +1 and to eliminate the effect on r of the units in which
X and Y happen to be expressed. If the relationship between X and Y can be
described perfectly by the linear regression function $Y_r = a + bX$ (see
Ch. IV), then $r = \pm 1$. Similarly, if the deviations $X - \bar{X}$ and $Y - \bar{Y}$ are
independent of each other, i.e., if there is no correlation, we have r = 0.

These are only a few of the important properties of r, but for our present
purpose, no other properties will be needed.

For the first set of data r = +1, whereas in the second set r = 0.

The first difference of means test clearly does not distinguish between cases 
in which X and Y are correlated and those in which no correlation is present. 

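If the test

$$t = \frac{\bar{d}}{\sqrt{\dfrac{\sum (d - \bar{d})^2}{(n - 1)\, n}}}$$

is applied, we obtain for the first set of data

     d
    -1
    -1
    -1
    -1
    -1

or t is infinite, i.e., the mean difference $\bar{d} = -1$ is certainly significant. For
the second set, in which r = 0, we find

$$t = \frac{\bar{d}}{\sqrt{\dfrac{\sum (d - \bar{d})^2}{(n - 1)\, n}}} = \frac{-1}{\sqrt{\dfrac{52}{4 \cdot 5}}} = -0.620$$

exactly as before, but now with four instead of eight degrees of freedom.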

In the case of positively correlated X and Y, elimination of the correlation
by forming differences showed that the mean difference of -1 was highly
significant; on the other hand, in the case of uncorrelated X and Y the same
test was less sensitive than the ordinary difference of means test, for we
obtained the same value of t with a loss of four degrees of freedom.
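
The figures quoted for the two sets of data can be recomputed directly. The sketch below uses the two samples printed above; the function names are illustrative, and an infinite t is returned when every paired difference is identical.

```python
import math


def unpaired_t(x, y):
    """Ordinary difference-of-means t with pooled variance (nx + ny - 2 d.f.)."""
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    ss = sum((v - mx) ** 2 for v in x) + sum((v - my) ** 2 for v in y)
    pooled = ss / (nx + ny - 2)
    return (mx - my) / math.sqrt(pooled * (1 / nx + 1 / ny))


def paired_t(x, y):
    """t computed from the paired differences d = X - Y (n - 1 d.f.)."""
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    md = sum(d) / n
    ss = sum((v - md) ** 2 for v in d)
    if ss == 0:
        return math.copysign(math.inf, md)  # no variation among differences
    return md / math.sqrt(ss / ((n - 1) * n))


def correlation(x, y):
    """Correlation coefficient r as defined in the text."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


x = [6, 4, 8, 3, 9]
y_first = [7, 5, 9, 4, 10]   # first set: perfectly correlated with x
y_second = [7, 4, 10, 9, 5]  # second set: same values, rearranged

for label, y in (("first set", y_first), ("second set", y_second)):
    print(label,
          "unpaired t =", round(unpaired_t(x, y), 3),
          "r =", round(correlation(x, y), 3),
          "paired t =", paired_t(x, y))
```

The unpaired t is the same for both sets, since it ignores the pairing, while the paired t separates them sharply.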

We may note that the ordinary difference of means test may be modified
so that it is equivalent to the second test. In place of

$$\frac{\sum (X - \bar{X})^2 + \sum (Y - \bar{Y})^2}{2(n - 1)}$$

as an estimate of $\sigma^2$ in the original test, we use

$$\frac{\sum (X - \bar{X})^2 + \sum (Y - \bar{Y})^2 - 2 \sum (X - \bar{X})(Y - \bar{Y})}{2(n - 1)} \qquad [9]$$

an estimate of $\sigma^2$ based on n - 1 degrees of freedom, for the cross product
reduces the number of independent observations by n - 1. For the first set
of data, $\sum (X - \bar{X})(Y - \bar{Y}) = 26$, we would obtain

$$\sigma_{\bar{X} - \bar{Y}}^2 = \frac{26 + 26 - 2(26)}{8} \cdot \frac{2}{5} = 0$$

or t is infinite for four degrees of freedom, which corresponds to the result
obtained when the correlation was eliminated by the alternative method of
forming $X_i - Y_i$.

It is not possible to say, in advance of actual trial, which of the two tests

$$t = \frac{\bar{X} - \bar{Y}}{\sqrt{\dfrac{2}{n} \cdot \dfrac{\sum (X - \bar{X})^2 + \sum (Y - \bar{Y})^2}{2n - 2}}} \qquad 2n - 2 \text{ degrees of freedom}$$

$$t = \frac{\bar{d}}{\sqrt{\dfrac{\sum (d - \bar{d})^2}{(n - 1)\, n}}} \qquad n - 1 \text{ degrees of freedom}$$

will be the more sensitive in a paired experiment. If the variables X and Y
are positively correlated, either forming differences or using [9] as an estimate
of $\sigma^2$ in the ordinary difference of means test will reduce the variance of (and
hence increase the significance of) the difference $\bar{X} - \bar{Y}$. This gain may be
nullified by the loss of half of the original number of independent observations.
In our examples this loss (from eight to four degrees of freedom) is of no
importance because $\sigma^2$ declined from 26 to 0; this is, of course, an extreme
example. If the variates are unpaired in the original experiment, the second
method is not available.

R. A. Fisher (15, a) has summed up the matter in the following sentence: 
" When both methods are available, sometimes the one and sometimes the 
other is the more sensitive; if either shows a significant deviation, its testi- 
mony cannot be ignored." 



CHAPTER II 
DIFFERENCES AMONG SEVERAL MEANS 

2.1 Example of a simple experimental arrangement. An industrial 
experimenter wishes to compare the effects of five types of grids, A, B, 
C, D, and E, on the vacuum of radio tubes. With each type of grid he 
uses five tubes. The results, expressed in terms of a measure of vacuum, 
follow 



       A       B       C       D       E
     93.6    95.3    94.5    96.8    94.6
     95.3    96.9    97.0    98.2    97.8
     97.0    95.8    97.8    97.2    98.0
     93.7    97.3    97.0    97.2    95.0
     98.0    97.7    98.3    97.9    98.9



2.2 General nature of the analysis. Even if the five types of grids 
are the same, the five column means, i.e., grid means, are not likely to 
be identical. For if from five populations (which we shall assume to be 
normal) which are alike in their means as well as in their variances, five 
random samples each of five observations are drawn, the five sample 
means will differ among themselves by chance. Our problem is to 
determine whether or not the observed variation in column means can 
be so explained. If it cannot, the hypothesis that the five normal popu- 
lations are alike in their means and variances is rejected. Then, if it can 
be shown that the data do not refute those parts of the hypothesis 
covering normality and equality of variances, it will be concluded that 
the means of the five populations differ significantly among themselves, 
i.e., the five types of grids differ significantly, in a statistical sense, in 
their effects on vacuum. 

If the five types of grids are alike in their effects on vacuum, the 
column means will vary about their mean by an amount which is deter- 
minable from the variation of the individual observations in the columns 
about their respective column means. For if the only identifiable
factor (differences among grids) is without effect, both variations among
column means and within columns are allocable to the same host of
unidentifiable factors. Notice that it is not stated that, if grids are
alike in their effects, the variation among column means will be equal
to that within columns for, though both variations are caused by the 
same forces, means will practically always vary less than the individual 
observations of which they are formed. 

If differences among grids really affect vacuum, this calculable rela- 
tionship between variation among column means and variation within 
columns does not exist. For while variation within columns is still 
caused only by unidentifiable causes, variation among column means is 
now attributable to these factors and to real differences among grids. 
This brings us to the nature of the test of significance, later to be called 
the F test. First, interpreting the unallocable variation within columns 
as the error of the experiment, we can set limits on the amount of varia- 
tion that would be expected among the column means if the same 
host of unidentifiable factors affect both. If the observed variation 
among means is outside these limits, the hypothesis that the grids are 
without effect is rejected. 
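
The comparison described in 2.2 can be sketched numerically for the grid data of 2.1. The code below is an illustration, not the book's computing form: it separates the total variation into variation among column means and variation within columns, and forms the ratio of the two mean squares. Treating that ratio as the F statistic is an assumption here, since the F test itself is only introduced later in the text.

```python
# Vacuum readings for the five grid types, from the table in 2.1.
grids = {
    "A": [93.6, 95.3, 97.0, 93.7, 98.0],
    "B": [95.3, 96.9, 95.8, 97.3, 97.7],
    "C": [94.5, 97.0, 97.8, 97.0, 98.3],
    "D": [96.8, 98.2, 97.2, 97.2, 97.9],
    "E": [94.6, 97.8, 98.0, 95.0, 98.9],
}

all_values = [v for column in grids.values() for v in column]
grand_mean = sum(all_values) / len(all_values)
column_means = {g: sum(col) / len(col) for g, col in grids.items()}

# Variation of the 25 observations about the grand mean.
ss_total = sum((v - grand_mean) ** 2 for v in all_values)

# Variation among column (grid) means, weighted by column size.
ss_among = sum(len(col) * (column_means[g] - grand_mean) ** 2
               for g, col in grids.items())

# Variation of observations about their own column means (the error).
ss_within = sum((v - column_means[g]) ** 2
                for g, col in grids.items() for v in col)

df_among = len(grids) - 1
df_within = len(all_values) - len(grids)

# Ratio of mean squares (assumed to correspond to the later F test).
f_ratio = (ss_among / df_among) / (ss_within / df_within)

print("column means:", {g: round(m, 2) for g, m in column_means.items()})
print("SS among =", round(ss_among, 3), " SS within =", round(ss_within, 3))
print("ratio of mean squares =", round(f_ratio, 3))
```

Dividing each sum of squares by its degrees of freedom puts the two variations on a comparable footing, which is why means "varying less" than individual observations does not by itself bias the comparison.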

2.3 Randomization of factors. In the experimental arrangement 
described in 2.1, all factors which might affect vacuum (other than 
grids) must be allocated at random. For example, assume that several 
sealing machines are used. If all tubes with grid A are sealed by the 
first machine and all tubes with grid B are sealed by the second machine, 
any conclusions regarding the effect on vacuum of differences among grids 
are vitiated, for the observed differences among column means are allocable 
to machines and/or to grids. Such vitiation can be precluded by assign- 
ing machines to grids at random. This applies to all influential factors. 

The experimental arrangement shown in 2.1 will be called a com- 
pletely randomized arrangement. 

2.4 Magnitude of the error. The error of the completely random- 
ized experiment can be taken to consist of variation in vacuum unex- 
plained by differences among the grids. This variation is made up of 
the effects of differences among sealing machines, personnel, etc., and is 
directly measured by the variation of observations within columns, for 
such variation does not involve differences among grids. Thus, if
several sealing machines are used on the five tubes containing grid 
type A, the variation of the observations in the first column about the 
mean of the first column is partly the result of differences among 
machines. If the operators used on the machines are of different skills, 
the result will be still greater variation among the observations on 
vacuum within each column, that is, still larger experimental error. 

2.5 Complete control. Experimental error can always be reduced
by holding constant all factors except the one under investigation.
If only one sealing machine is used for all 25 tubes, differences among
sealing machines no longer contribute to the error of the experiment.
And if only one operator is used, the same is true of differences among 
operators. 

It is often impossible to achieve, simultaneously, complete control 
over all influential factors. For example, in an experiment dealing 
with variation in the quality of yarn, the use of a single loom would 
prolong the experiment over many weeks and the experimental error, 
decreased by the absence of loom differences, would be increased by the 
influence of factors which change with the passage of time, such as 
workroom humidity, operator efficiency, etc. If this source of varia- 
tion is reduced by conducting the experiment in a single week, many 
looms will be required and loom differences reenter. In any case, the 
types of experimental arrangements we shall now describe obviate the 
need for complete control; in addition, they can often be made to yield 
valuable information which cannot be obtained from completely 
controlled experiments. 

2.6 Latin Square. The Latin Square is an arrangement which 
permits at least two factors (other than the one being studied) to vary 
during the experiment, and yet it excludes the principal component of 
their variation from the error of the experiment. Assume that sealing 
machines and operators are two factors which might affect vacuum. If 
five grids are to be compared, the Latin Square arrangement requires 
five machines and five operators. The machines and operators are 
allocated to grids (A, B, C, D, and E) in such a way that the separate 
grids, machines, and operators are associated in the same trio only once. 





                        Machine
    Operator     1     2     3     4     5
        1        E     B     D     A     C
        2        C     D     B     E     A
        3        A     C     E     B     D
        4        D     E     A     C     B
        5        B     A     C     D     E


In order to appreciate the merits of this arrangement, consider the 
completely randomized experiment. In that experiment, two types of 
variation were noted, namely, variation among the grid means and the 
unallocable variation of the individual observations about their respective 
grid means, i.e., variation within grids. There is no other source of 
variation in that experiment; the total variation, i.e., the variation of 
the 25 observations about the grand mean is made up of these two 
variations. 

Now assume that the earlier data were obtained from the following 
Latin Square arrangement. 





                               Machine
Operator       1           2           3           4           5

   1        E  98.0     B  95.8     D  97.2     A  97.0     C  97.8
   2        C  98.3     D  97.9     B  97.7     E  98.9     A  98.0
   3        A  93.6     C  94.5     E  94.6     B  95.3     D  96.8
   4        D  97.2     E  95.0     A  93.7     C  97.0     B  97.3
   5        B  96.9     A  95.3     C  97.0     D  98.2     E  97.8



The total variation is the same as before. The grid means are 
unchanged; hence the variation among grids is the same as before. 
If we subtract variation among grids from the total variation, we 
obtain a term, say b, which must be numerically equal to the error 
term in the randomized arrangement, i.e., the term called variation 
within grids. But while the latter represented variation unallocable to 
any specific factor or factors, the b of the Latin Square can be divided 
into three parts, two of which are allocable to specific factors and the 
third of which is unallocable, i.e., unidentifiable. 

In the present example the two new identifiable factors are machines 
and operators. The variation due to differences among machines (vari- 
ation among column means), which contributed heavily to the error 
of the completely randomized experiment, is removable from the error 
of the present experiment; for inasmuch as each grid and each operator 
have been used an equal number of times (once) with each machine, 
the removal of the effect of machine differences cannot vitiate the com- 
parison of grids (or of operators). In fact, in a Latin Square it is not 
possible to attribute the mean effect of any one factor to either or both 
of the remaining two factors. The effects of the three factors are completely 
separated; each effect is measurable and removable without 
interference with the others. 

In the completely randomized experiment, it was not possible to 
remove machine effects for, first, there was no stated record as to which 
machine was used with each grid and operator, and second, even if 
there were such a record, it is unlikely unless deliberately planned that 
just five machines would be used and that each would be used exactly 
once with each grid and each operator. If these conditions are not 
satisfied, machine effects cannot be removed. For example, assume 
that the completely randomized experiment was conducted as follows 
(the numbers in parentheses refer to the different machines) : 

                           Grid

    A           B           C           D           E

93.6 (1)    95.3 (2)    94.5 (5)    96.8 (3)    94.6 (5)
95.3 (1)    96.9 (4)    97.0 (1)    98.2 (3)    97.8 (5)
97.0 (3)    95.8 (4)    97.8 (1)    97.2 (2)    98.0 (5)
93.7 (3)    97.3 (2)    97.0 (1)    97.2 (4)    95.0 (4)
98.0 (3)    97.7 (2)    98.3 (5)    97.9 (2)    98.9 (4)



Exactly five machines were used, but the machine effect cannot be 
removed for such a step would in part remove any effect of grids. For 
example, the difference in the means of the first and second machines is 
especially entwined with the differences of grids B and C. 
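
The entanglement can be seen by counting how often each machine was used with each grid in the layout above. The following Python sketch does the counting; the function and variable names are ours and purely illustrative, and the machine numbers are simply those shown in parentheses in the table.

```python
from collections import Counter

# Machine numbers used with each grid in the completely randomized layout
# above, read down each grid's column.
machines_by_grid = {
    "A": [1, 1, 3, 3, 3],
    "B": [2, 4, 4, 2, 2],
    "C": [5, 1, 1, 1, 5],
    "D": [3, 3, 2, 4, 2],
    "E": [5, 5, 5, 4, 4],
}


def incidence(layout):
    """Return {grid: Counter(machine -> number of tubes sealed)}."""
    return {grid: Counter(machines) for grid, machines in layout.items()}


def is_balanced(layout, machines=(1, 2, 3, 4, 5)):
    """True only if every machine is used exactly once with every grid."""
    table = incidence(layout)
    return all(table[grid][m] == 1 for grid in table for m in machines)


table = incidence(machines_by_grid)
for grid in sorted(table):
    # One row per grid, one count per machine 1..5.
    print(grid, [table[grid][m] for m in (1, 2, 3, 4, 5)])

print("balanced:", is_balanced(machines_by_grid))
```

Machine 1 sealed three tubes of grid C and none of grid B, while machine 2 sealed three of grid B and none of grid C; a correction for the difference between these two machines would therefore also shift the comparison of grids B and C.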

Returning to the Latin Square, it is apparent that if the effects of 
machine and operator differences are statistically significant the experi- 
mental error of the square (error in the sense of unexplainable variation) 
will be less than that of the completely randomized arrangement. In 
the notation of the following table, b3 will be less than b. 

COMPARABLE INDEXES OF VARIATION 

Completely randomized experiment       Latin Square

Variation among grids    (a)           Variation among grids       (a)
Variation within grids   (b)           Variation among machines    b1 \
                                       Variation among operators   b2  } (b)
                                       Unallocable variation       b3 /

Total variation          (c)           Total variation             (c)
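
Written as equations, the table states the following identities. The numerical line is only an illustration; it quotes the sums of squares obtained for the Latin Square of this example in 2.13.

$$c = a + b, \qquad b = b_1 + b_2 + b_3$$

$$c = a + b_1 + b_2 + b_3: \qquad 54.53 = 10.25 + 12.42 + 29.59 + 2.27$$

Only $b_3$ serves as error in the Latin Square, whereas the whole of $b$ serves as error in the completely randomized experiment.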



2.7 Size of a Latin Square. It is disadvantageous to use many 
machines and many operators for the Latin Square excludes only the 
variation among the row and column means from experimental error. 
If many machines are used, the error will be large even if machine means 
are identical. The same is true for operators. This limit on the num- 
ber of machines and operators automatically limits the number of types 
of grids which can be compared in a single square. A 10 by 10 square 
may be taken as the maximum. 

A small Latin Square is unreliable for while a Square of any size tends 
to reduce experimental error, the error of our estimate of the true 
value of that error from, say, the nine observations in a 3 by 3 Square 
is high. If as few as three or four grids are to be compared, more than 
one Latin Square must be used. 

2.8 Other considerations. In the Latin Square arrangement, more 
than two influential factors can be admitted and their effects on error 
can be excluded. Assume that along with machine and operator differ- 
ences it is believed that workroom humidity at the time of sealing affects 
vacuum. The various humidities that actually occur may be divided 
into five classes; the first machine is used only at the lowest humidity, 
the second only at another humidity, etc. What was in the original 
Latin Square variation allocable to machines is now variation allocable 
to machines or humidity or both. As we no longer know exactly what 
causes this variation, there has been some loss of information. How- 
ever, if the exclusive purpose of the experiment is to distinguish among 
grids, this loss is of no importance and there may be a gain in the form 
of a further reduction of error by the exclusion of the effects of a 
third factor, humidity. 

Superior arrangements for handling more than two factors, such as 
Graeco-Latin Squares, will not be discussed here. 

The Latin Square is an experimental arrangement in which the alloca- 
tion of machines and operators is subject to a double restriction: Each 
machine must be used once with each grid and each operator; also each 
operator must be used once with each grid and each machine. Many 
Squares satisfy these requirements (for example, the rows in a Latin 
Square may be interchanged). The Squares actually used can be 
selected at random by many card-drawing schemes, which the reader 
can easily arrange for himself. 
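
A present-day substitute for card drawing is sketched below in Python. It starts from the Square of 2.6, shuffles its rows, its columns, and the assignment of letters to grids, and checks that the result is still a Latin Square. The function names and the fixed seed are our own assumptions for illustration; note also that this shuffling does not draw with equal probability from every possible Square, it merely selects one of the many that can be reached from the starting Square.

```python
import random


def is_latin_square(square):
    """True if every symbol appears exactly once in each row and column."""
    n = len(square)
    symbols = set(square[0])
    if len(symbols) != n:
        return False
    rows_ok = all(len(row) == n and set(row) == symbols for row in square)
    cols_ok = all({row[j] for row in square} == symbols for j in range(n))
    return rows_ok and cols_ok


def randomize(square, rng):
    """Shuffle the rows, the columns and the letters of a Latin Square."""
    n = len(square)
    rows = rng.sample(range(n), n)          # random row order
    cols = rng.sample(range(n), n)          # random column order
    letters = sorted(set(square[0]))
    relabel = dict(zip(letters, rng.sample(letters, n)))
    return [[relabel[square[r][c]] for c in cols] for r in rows]


# The 5 by 5 Square of 2.6 (rows = operators, columns = machines).
base = [list(row) for row in ("EBDAC", "CDBEA", "ACEBD", "DEACB", "BACDE")]

rng = random.Random(1)   # fixed seed only so that the sketch is repeatable
new_square = randomize(base, rng)
for row in new_square:
    print(" ".join(row))
```

Interchanging rows, columns, or letters never places a letter twice in the same row or column, which is why the check at the end always succeeds.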

2.9 Randomized blocks. The Latin Square arrangement excludes 
the effect of at least two of the factors, say machines and personnel, from 
the unexplained variation to which differences in the third factor, grids, 
are compared. Assume that it was known that one of the two factors 
was without effect, for example, that sealing machines do not differ 
among themselves in their effect on vacuum. Only the effect of differ- 
ences among operators need be excluded, and the following plan, known 
to agronomists as a randomized block arrangement, is appropriate. 

                         Operator

    1           2           3           4           5

95.8 (B)    98.0 (A)    96.8 (D)    95.0 (E)    95.3 (A)
97.0 (A)    98.3 (C)    94.6 (E)    97.3 (B)    97.0 (C)
97.8 (C)    97.9 (D)    93.6 (A)    93.7 (A)    96.9 (B)
97.2 (D)    97.7 (B)    95.3 (B)    97.0 (C)    97.8 (E)
98.0 (E)    98.9 (E)    94.5 (C)    97.2 (D)    98.2 (D)



Differences among the means of the blocks, i.e., among the operator 
means, can be removed for each type of grid (the only factor remaining) 
as each type of grid is represented once in each block. Similarly, grid 
differences can be removed, for each operator is represented once with 
each type of grid. Other factors, such as machines and humidity, are 
handled as in 2.8 or are allocated to grids strictly at random in order to 
prevent possible vitiation. 

2.10 Analysis of variance in a completely randomized experiment. 
We shall now consider the method of analysis to be applied to the com- 
pletely randomized experiment. If vacuum is unaffected by grid 
differences, any variation among the five grid means is caused by the 
same unidentifiable factors that cause variation of the individual observations 
"within" each grid. Now conceive of the observations within 
the first column as having been drawn from one normal population, the 
observations within the second column from a second normal population, 
and so on, and the grid means as having been formed from samples 
whose items were drawn from a sixth normal population. If these 
normal populations are identical, the six estimates of their common 
variance tend to equality. All within-column variation is, however, of 
the same nature, so we shall reduce the number of estimates to one 
pooled within-column estimate and one estimate based on variation 
among means. 

How are these estimates formed? If we pool the within-grid variations 
for two columns (grids), an unbiased estimate of the population 
variance $\sigma^2$ is 

$$\frac{\sum (X_1 - \bar{X}_1)^2 + \sum (X_2 - \bar{X}_2)^2}{n_1 + n_2 - 2}$$

as was proved in 1.38. The proof given there is easily extended to 
k columns; the unbiased estimate of $\sigma^2$ formed from the pooled 
within-column variation of all k columns is 

$$\hat{\sigma}_1^2 = \frac{\sum (X_1 - \bar{X}_1)^2 + \sum (X_2 - \bar{X}_2)^2 + \cdots + \sum (X_k - \bar{X}_k)^2}{n_1 + n_2 + \cdots + n_k - k} \qquad [1]$$

In the notation used here $X_1$ is summed over the $n_1$ values of X in the 
first column, $X_2$ is summed over the $n_2$ values of X in the second column, 
etc. In our example, $n_1 = n_2 = \cdots = n_k = 5$ and $k = 5$. 

Now consider the estimate of $\sigma^2$ formed from the variation among 
column means. The k means $\bar{X}_1, \bar{X}_2, \ldots, \bar{X}_k$ constitute a sample of k 
variates; an unbiased estimate of the variance of the normal population 
of such variates, i.e., of means, is given by 

$$\frac{\sum (\bar{X}_c - \bar{X})^2}{k - 1} \qquad [2]$$

But these variates are means, not individual observations, and it 
would be incorrect to expect [2] to be equal to [1]. [2] is an estimate 
of the variance of a population of means whose variance in terms of the 
variance of the individual observations has already been shown to be 
$\sigma^2 / n_c$ where $n_c$ is the number of observations from which each mean is 
formed (in the present example, $n_c = 5$). Hence 

$$\hat{\sigma}_2^2 = \frac{\sum n_c (\bar{X}_c - \bar{X})^2}{k - 1} \qquad [3]$$

is an unbiased estimate of $\sigma^2$. 

A third unbiased estimate of $\sigma^2$ is obtained from all n observations 
taken together. This estimate is 

$$\hat{\sigma}_3^2 = \frac{\sum (X - \bar{X})^2}{n - 1} \qquad [4]$$

It will be noted that the numerator of each estimate contains as many 
terms as there are observations in the experiment (n). This is immediately 
apparent in [1] and [4], while in [3] there are k terms $(\bar{X}_c - \bar{X})^2$ 
and each is weighted by the number of observations $n_c$ in the corresponding 
column, i.e., a total of n terms. It will simplify our terminology 
if we take advantage of this fact and understand the summation 
sign to include n terms. Thus [1] will henceforth be written 

$$\hat{\sigma}_1^2 = \frac{\sum (X - \bar{X}_c)^2}{n - k}$$

and [3] will be replaced by 

$$\hat{\sigma}_2^2 = \frac{\sum (\bar{X}_c - \bar{X})^2}{k - 1}$$
The results to this point can be summarized in the following table: 

Source of         Sum of                            Degrees of   Mean
variation         squares                           freedom      square

Among columns     $\sum (\bar{X}_c - \bar{X})^2$    $k - 1$      $\hat{\sigma}_2^2 = \sum (\bar{X}_c - \bar{X})^2 / (k - 1)$
Within columns    $\sum (X - \bar{X}_c)^2$          $n - k$      $\hat{\sigma}_1^2 = \sum (X - \bar{X}_c)^2 / (n - k)$

Total             $\sum (X - \bar{X})^2$            $n - 1$      $\hat{\sigma}_3^2 = \sum (X - \bar{X})^2 / (n - 1)$



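Let $X_{ij}$ represent the ith variate in the jth column, and let $\bar{X}_j$ 
represent the mean of the jth column. The sums of squares are 

$$\begin{aligned}
\sum (\bar{X}_c - \bar{X})^2 ={}& (\bar{X}_1 - \bar{X})^2 + (\bar{X}_1 - \bar{X})^2 + \cdots + (\bar{X}_1 - \bar{X})^2 && (n_1 \text{ terms}) \\
&+ (\bar{X}_2 - \bar{X})^2 + (\bar{X}_2 - \bar{X})^2 + \cdots + (\bar{X}_2 - \bar{X})^2 && (n_2 \text{ terms}) \\
&+ \cdots \\
&+ (\bar{X}_k - \bar{X})^2 + (\bar{X}_k - \bar{X})^2 + \cdots + (\bar{X}_k - \bar{X})^2 && (n_k \text{ terms})
\end{aligned}$$

$$\begin{aligned}
\sum (X - \bar{X}_c)^2 ={}& (X_{11} - \bar{X}_1)^2 + (X_{21} - \bar{X}_1)^2 + \cdots + (X_{n_1 1} - \bar{X}_1)^2 \\
&+ (X_{12} - \bar{X}_2)^2 + (X_{22} - \bar{X}_2)^2 + \cdots + (X_{n_2 2} - \bar{X}_2)^2 \\
&+ \cdots \\
&+ (X_{1k} - \bar{X}_k)^2 + (X_{2k} - \bar{X}_k)^2 + \cdots + (X_{n_k k} - \bar{X}_k)^2
\end{aligned}$$

$$\begin{aligned}
\sum (X - \bar{X})^2 ={}& (X_{11} - \bar{X})^2 + (X_{21} - \bar{X})^2 + \cdots + (X_{n_1 1} - \bar{X})^2 \\
&+ (X_{12} - \bar{X})^2 + (X_{22} - \bar{X})^2 + \cdots + (X_{n_2 2} - \bar{X})^2 \\
&+ \cdots \\
&+ (X_{1k} - \bar{X})^2 + (X_{2k} - \bar{X})^2 + \cdots + (X_{n_k k} - \bar{X})^2
\end{aligned}$$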
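The most convenient forms for computing the various sums of squares 
are the following 

$$\sum (X - \bar{X})^2 = \sum X^2 - n\bar{X}^2$$

$$\sum (\bar{X}_c - \bar{X})^2 = \sum \bar{X}_c^2 - n\bar{X}^2$$

$$\sum (X - \bar{X}_c)^2 = \sum X^2 - \sum \bar{X}_c^2$$

where $\sum \bar{X}_c^2 = \dfrac{(\sum X_1)^2}{n_1} + \dfrac{(\sum X_2)^2}{n_2} + \cdots + \dfrac{(\sum X_k)^2}{n_k}$ 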



In the notes at the end of this chapter it is shown that the total sum 
of squares is equal to the sum of squares associated with among-column 
variation plus that associated with within-column variation. 

The degrees of freedom are also additive. The number of degrees of 
freedom associated with each estimate will be given by the number of 
variates in each summation less the number of constants (means) about 
which the deviations of the variates are taken. Thus, for total variability, 
we have n values of X in 

$$\sum (X - \bar{X})^2$$

less one mean, $\bar{X}$ (the mean of all the observations), or $n - 1$ degrees of 
freedom. For variability among columns, there are k values of $\bar{X}_c$ in 

$$\sum (\bar{X}_c - \bar{X})^2$$

less one mean $\bar{X}$, or $k - 1$ degrees of freedom. For within-column 
variability we have n values of X in 

$$\sum (X - \bar{X}_c)^2$$

less k values of $\bar{X}_c$, or $n - k$ degrees of freedom. 
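
The computation can be sketched in a few lines of Python, using the computing forms (sums of squares taken from totals) rather than deviations. The function name and the layout of the result are our own; the data are the 25 vacuum readings of this chapter grouped by grid, with the grid C reading under operator 3 taken as 94.5, the value that appears in the other listings of these data.

```python
def one_way_anova(columns):
    """Sums of squares, degrees of freedom and mean squares by column."""
    n = sum(len(col) for col in columns)
    k = len(columns)
    grand_total = sum(sum(col) for col in columns)
    correction = grand_total ** 2 / n                 # n times (grand mean)^2
    sum_x2 = sum(x * x for col in columns for x in col)
    among_raw = sum(sum(col) ** 2 / len(col) for col in columns)

    ss_total = sum_x2 - correction
    ss_among = among_raw - correction
    ss_within = sum_x2 - among_raw
    return {
        "among": (ss_among, k - 1, ss_among / (k - 1)),
        "within": (ss_within, n - k, ss_within / (n - k)),
        "total": (ss_total, n - 1, None),
    }


# Vacuum readings grouped by grid.
grids = {
    "A": [97.0, 98.0, 93.6, 93.7, 95.3],
    "B": [95.8, 97.7, 95.3, 97.3, 96.9],
    "C": [97.8, 98.3, 94.5, 97.0, 97.0],
    "D": [97.2, 97.9, 96.8, 97.2, 98.2],
    "E": [98.0, 98.9, 94.6, 95.0, 97.8],
}

table = one_way_anova(list(grids.values()))
for source, (ss, df, ms) in table.items():
    print(source, round(ss, 2), df, None if ms is None else round(ms, 2))
```

Both the sums of squares and the degrees of freedom in the first two rows add to the figures in the total row, which is the additivity described above.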

2.11 The F test. Now let us review the hypothesis to which this 
analysis is relevant. The hypothesis H states that the k column means 
arise from identical normal populations, i.e., normal populations of the 
same mean $\bar{X}'$ and the same variance $\sigma^2$. If the hypothesis is true, 
the estimates $\hat{\sigma}_1^2$, $\hat{\sigma}_2^2$, and $\hat{\sigma}_3^2$ should be the same, within the allowable 
range of sampling error. If, however, the ratio of, say, $\hat{\sigma}_2^2$ to $\hat{\sigma}_1^2$ is 
significantly different from unity, the hypothesis must be rejected, i.e., the 
columns do not come from normal populations of the same mean and the 
same variance. Now if the ratio of $\hat{\sigma}_2^2$ to $\hat{\sigma}_1^2$ differs significantly from 
unity while (1) the $a$ and $\sqrt{b_1}$ tests support normality (or normality is 
assumed) and (2) the $L_1$ test supports the hypothesis that all columns 
come from populations of the same variance, it follows that the part of 
H which is untenable is that the columns come from populations alike 
in their means; i.e., the column or grid means differ significantly among 
themselves in their effect on vacuum. 

The value of the ratio F of any two estimates of $\sigma^2$ will tend to unity 
as the number of independent variates on which each estimate is based 
is increased. The distribution of F for random samples drawn from 
normal populations is known, i.e., the probability of obtaining a value 
of F larger than any given value is known. We compute the ratio of an 
estimate associated with a suspected "cause" to the estimate which 
best defines the error of the experiment. If the probability is small, 
say less than 0.05, that this ratio could have occurred by chance in 
sampling from normal populations of identical means and variances, the 
hypothesis that such were the populations is rejected. 

The numerical analysis of the completely randomized experiment 
follows. 



Source of variation      Sum of squares   Degrees of freedom   Mean square

Among grids                   10.25               4                2.56
Within grids (error)          44.28              20                2.21

Total                         54.53              24

$$F = \frac{2.56}{2.21} = 1.16$$

is based on 4 and 20 degrees of freedom. From Table VIII, F would 
have to be as large as 2.87 in order to overthrow the hypothesis. We 
conclude that grid differences are without effect on vacuum. 

2.12 A t test after an F test. Had the entire set of grids differed 
significantly among themselves, the following procedure could have 
been used to determine whether or not the apparent best and second 
best grids differ significantly between themselves. 

The estimate of the variance of a single observation is 2.21. 
The standard deviation is $\sqrt{2.21} = 1.49$. Each grid mean is based on 
five observations; the standard error of a grid mean is accordingly 
$1.49/\sqrt{5} = 0.6664$. The difference of any two grid means has a standard 
error $\sigma_d$ 

$$\sigma_d = 1.49\sqrt{\tfrac{1}{5} + \tfrac{1}{5}} = 0.942$$

The difference of two means, to be significant, should exceed, say, 
$2.086\,\sigma_d = 1.97$, the figure 2.086 being at the 5 per cent level of t for 
20 degrees of freedom. The actual difference of the two best grids is 
only 0.54. In fact no two grid means differ by as much 
as 1.97. 
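
The arithmetic of this paragraph can be retraced with the short Python sketch below. The variable names are ours; the grid means are computed from the 25 readings (with the grid C reading under operator 3 taken as 94.5), and the 5 per cent value of t is the tabled figure quoted in the text, not something the sketch derives. Because the sketch carries the standard deviation unrounded, its limit comes out at about 1.96 rather than the 1.97 obtained from the rounded figure 1.49.

```python
import math

error_mean_square = 2.21   # within-grids mean square, 20 degrees of freedom
n_per_mean = 5             # observations behind each grid mean
t_5_percent = 2.086        # tabled 5 per cent value of t for 20 d.f. (from the text)

sd = math.sqrt(error_mean_square)
se_mean = sd / math.sqrt(n_per_mean)              # standard error of one grid mean
se_difference = sd * math.sqrt(2 / n_per_mean)    # standard error of a difference
least_significant_difference = t_5_percent * se_difference

# Grid means computed from the data of this chapter.
grid_means = {"A": 95.52, "B": 96.60, "C": 96.92, "D": 97.46, "E": 96.86}
ranked = sorted(grid_means.values(), reverse=True)
gap_two_best = ranked[0] - ranked[1]
largest_gap = ranked[0] - ranked[-1]

print(round(se_mean, 3), round(se_difference, 3))
print(round(least_significant_difference, 2))
print(round(gap_two_best, 2), round(largest_gap, 2))
```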

If many grids are studied, two grid means may well differ "significantly" 
even though the F test indicates over-all homogeneity. For 
even in a homogeneous set of means, the difference between, say, the 
largest and smallest means will likely appear to be "significant." 
A t test applied to two means picked out after the over-all test, whether 
or not homogeneity has been refuted, must be used with caution. 
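
The reason for caution can be illustrated by a small sampling experiment, sketched here in Python. It is only a rough illustration under simplifying assumptions of our own: all means are drawn from one normal population whose standard deviation is taken as known (so the normal limit 1.96 is used in place of t), and the seed and number of trials are arbitrary.

```python
import math
import random


def share_of_false_alarms(k, n_per_mean=5, trials=2000, seed=7):
    """Fraction of homogeneous experiments whose extreme means 'differ'.

    All k means come from one normal population with sigma = 1, so any
    'significant' gap between the largest and smallest mean is a false alarm.
    """
    rng = random.Random(seed)
    se_difference = math.sqrt(2 / n_per_mean)
    threshold = 1.96 * se_difference       # two-mean 5 per cent limit, sigma known
    hits = 0
    for _ in range(trials):
        means = [
            sum(rng.gauss(0, 1) for _ in range(n_per_mean)) / n_per_mean
            for _ in range(k)
        ]
        if max(means) - min(means) > threshold:
            hits += 1
    return hits / trials


for k in (2, 5, 10):
    print(k, share_of_false_alarms(k))
```

With two means the share of false alarms stays near the nominal 5 per cent; as more means are compared and only the extreme pair is tested, the share rises well above it.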

2.13 Analysis of variance in a Latin Square. In the Latin Square 
let X represent an observation, $\bar{X}_g$ a grid mean, $\bar{X}_m$ a machine mean, 
$\bar{X}_o$ an operator mean, and $\bar{X}$ the grand mean. Let there be n observations, 
G grids, M machines, and O operators ($n = G^2$). Four independent 
estimates of the variance of the population can be found, and 
they are listed in the following table. 

Source of variation   Sum of squares                    Degrees of freedom            Mean square

Among grids           $\sum (\bar{X}_g - \bar{X})^2$    $G - 1$                       $\sum (\bar{X}_g - \bar{X})^2 / (G - 1)$
Among machines        $\sum (\bar{X}_m - \bar{X})^2$    $M - 1$                       $\sum (\bar{X}_m - \bar{X})^2 / (M - 1)$
Among operators       $\sum (\bar{X}_o - \bar{X})^2$    $O - 1$                       $\sum (\bar{X}_o - \bar{X})^2 / (O - 1)$
Residual (error)      U (obtained by subtraction)       V (obtained by subtraction)   $U/V$

Total                 $\sum (X - \bar{X})^2$            $G^2 - 1$











For the data shown in 2.6 we have the table below. 



Source of variation      Sum of squares   Degrees of freedom   Mean square

Among grids                   10.25               4                2.56
Among machines                12.42               4                3.11
Among operators               29.59               4                7.40
Residual (error)               2.27              12                0.19

Total                         54.53              24






Machines and operators account for a very large part of the variation 
which, in the completely randomized experiment, constituted error. 
We have, for grids 

$$F = \frac{2.56}{0.19} = 13.5$$

which for 4 and 12 degrees of freedom is highly significant. The Latin 
Square arrangement shows that differences among the grids have a real 
effect on vacuum, an effect which could not be determined from the 
completely randomized arrangement. 
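
The Latin Square table can be recomputed from the data of 2.6 with the Python sketch below. The function is our own illustration, not a standard routine; rows are taken as operators, columns as machines, and letters as grids, and the grid C reading under operator 3 is taken as 94.5, in agreement with the other listings of these data.

```python
def latin_square_anova(values, letters):
    """Sums of squares for rows, columns, letters and residual."""
    size = len(values)
    n = size * size
    grand_total = sum(sum(row) for row in values)
    correction = grand_total ** 2 / n

    def ss_from_totals(totals):
        # Each total is the sum of `size` observations.
        return sum(t * t for t in totals) / size - correction

    row_totals = [sum(row) for row in values]
    col_totals = [sum(values[i][j] for i in range(size)) for j in range(size)]
    letter_totals = {}
    for i in range(size):
        for j in range(size):
            key = letters[i][j]
            letter_totals[key] = letter_totals.get(key, 0.0) + values[i][j]

    ss = {
        "rows": ss_from_totals(row_totals),
        "columns": ss_from_totals(col_totals),
        "letters": ss_from_totals(letter_totals.values()),
        "total": sum(x * x for row in values for x in row) - correction,
    }
    ss["residual"] = ss["total"] - ss["rows"] - ss["columns"] - ss["letters"]
    return ss


# Rows are operators 1-5, columns are machines 1-5, letters are grids.
letters = ["EBDAC", "CDBEA", "ACEBD", "DEACB", "BACDE"]
vacuum = [
    [98.0, 95.8, 97.2, 97.0, 97.8],
    [98.3, 97.9, 97.7, 98.9, 98.0],
    [93.6, 94.5, 94.6, 95.3, 96.8],
    [97.2, 95.0, 93.7, 97.0, 97.3],
    [96.9, 95.3, 97.0, 98.2, 97.8],
]

ss = latin_square_anova(vacuum, letters)
df_residual = (5 - 1) * (5 - 2)                     # 12 degrees of freedom
f_grids = (ss["letters"] / 4) / (ss["residual"] / df_residual)
print({name: round(value, 2) for name, value in ss.items()})
print(round(f_grids, 1))
```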

Is the difference between the means of the two best grids statistically 
significant? The variance of a single observation in the Latin Square 
arrangement is only 0.19, as against 2.21 in the completely randomized 
experiment. The standard deviation is $\sqrt{0.19} = 0.436$. We are 
interested in the standard error of the difference of two means, each 
based on five observations, and this is given by 

$$0.436\sqrt{\tfrac{2}{5}}$$

The observed difference in means of the two best grids is 0.54, or 1.96 
standard error units, for 

$$\frac{0.54}{0.436\sqrt{2/5}} = 1.96$$

This does not quite reach the 5 per cent value of t for 12 degrees 
of freedom, which is 2.179. Hence the observed difference between the 
two best grids cannot be said to be statistically significant. The accompanying 
graph illustrates this example; A + B = 0.05. 

2.14 Analysis of variance in randomized blocks. The randomized 
block analysis is given in the table below: 



Source of variation      Sum of squares   Degrees of freedom   Mean square

Among grids                   10.25               4                2.56
Among operators               29.59               4                7.40
Residual (error)              14.69              16                0.92

Total                         54.53              24






The effect of grids is not quite significant. This experiment has rela- 
tive to grid differences a large experimental error (resulting from inclu- 
sion of machine effects in the error term), so that detection of differences 
in grid effects is not possible. 
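
The relation between this table and the Latin Square table of 2.13 is simple arithmetic: when machines are ignored, their sum of squares and degrees of freedom fall back into error. The Python sketch below retraces it using only the tabled figures; the names are ours, and the 5 per cent value of F for 4 and 16 degrees of freedom (about 3.01) is taken from standard tables rather than from the text.

```python
# Sums of squares and degrees of freedom from the Latin Square table of 2.13.
latin = {
    "grids": (10.25, 4),
    "machines": (12.42, 4),
    "operators": (29.59, 4),
    "residual": (2.27, 12),
}

# Ignoring machines returns their sum of squares and degrees of freedom to error.
block_error_ss = latin["residual"][0] + latin["machines"][0]
block_error_df = latin["residual"][1] + latin["machines"][1]
block_error_ms = block_error_ss / block_error_df

grid_ms = latin["grids"][0] / latin["grids"][1]
f_blocks = grid_ms / block_error_ms
f_square = grid_ms / (latin["residual"][0] / latin["residual"][1])

f_5_percent_4_16 = 3.01   # assumed tabled value, not quoted in the text

print(round(block_error_ss, 2), block_error_df, round(block_error_ms, 2))
print(round(f_blocks, 2), f_blocks > f_5_percent_4_16)
print(round(f_square, 1))
```

The ratio for grids falls from about 13.5 in the Square to about 2.8 in the blocks, just short of the tabled limit, which is the sense in which the effect is "not quite significant."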

2.15 Further examples. Tippett (42, b) has described two textile 
experiments, one of which used randomized blocks and the other a Latin 
Square. The data from one of these experiments are shown immedi- 
ately below but for the moment are assumed to come from a completely 
randomized arrangement. The experiment in question was designed to 
determine the effect of differences in roller weightings on the strength of 
yarn. Three roller weightings A, B, and C were used and there were 
four strength tests for each weight. The quality of the yarn is measured 
by the product of lea strength in pounds and count. 



   A        B        C

 1577     1535     1592
 1690     1640     1652
 1800     1783     1810
 1642     1621     1663



The analysis of variance follows: 



Source of variation      Sum of squares   Degrees of freedom   Mean square

Among weightings              3,000.8             2              1,500.4
Within weightings            83,982.2             9              9,331.4

Total                        86,983.0            11













No test of significance is necessary, for F is less than unity. It must 
be concluded that the effect of differences in weighting is not statistically 
significant. 

Actually this experiment employed a randomized block arrangement, 
the rows shown in the preceding table representing different sets of 
roving bobbins. 





                   Roller weighting
Roving set       A        B        C

    1          1577     1535     1592
    2          1690     1640     1652
    3          1800     1783     1810
    4          1642     1621     1663



In the earlier description row differences were unallocable and varia- 
tion among rows was an important element of the large experimental 
error. In the actual randomized block arrangement rows are identified 
as roving sets and, as each weighting is represented once in each roving 
set, the roving effects can be removed without interfering with weighting 
effects. The result as seen below will be a sharply reduced experimental 
error. 



Source of variation      Sum of squares   Degrees of freedom   Mean square

Among roller weights          3,000.8             2              1,500.4
Among roving sets            82,619.5             3             27,539.8
Residual (error)              1,362.7             6                227.1

Total                        86,983.0            11













We find 

$$F = \frac{1,500.4}{227.1} = 6.61$$

which for two and six degrees of freedom is statistically significant 
(P < 0.05). The evidence of this more sensitive experiment indicates 
that differences among roller weights do affect the strength of yarn. 
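
The two analyses of the yarn data can be retraced with the Python sketch below. The function is our own illustration for a layout with one observation per block and treatment. Computed directly from the twelve readings, the among-weightings sum of squares comes out at about 3,000.7, a shade below the tabled 3,000.8; the difference presumably reflects rounding in the original computation and does not affect the conclusion.

```python
def randomized_block_anova(rows):
    """rows[i][j] = observation for block i and treatment j (one per cell)."""
    blocks, treatments = len(rows), len(rows[0])
    n = blocks * treatments
    grand_total = sum(sum(r) for r in rows)
    correction = grand_total ** 2 / n

    ss_total = sum(x * x for r in rows for x in r) - correction
    ss_blocks = sum(sum(r) ** 2 for r in rows) / treatments - correction
    col_totals = [sum(r[j] for r in rows) for j in range(treatments)]
    ss_treat = sum(t * t for t in col_totals) / blocks - correction
    ss_error = ss_total - ss_blocks - ss_treat

    df_treat = treatments - 1
    df_error = (blocks - 1) * (treatments - 1)
    f_blocked = (ss_treat / df_treat) / (ss_error / df_error)
    # Same data treated as completely randomized: blocks fall into error.
    f_unblocked = (ss_treat / df_treat) / ((ss_total - ss_treat) / (n - treatments))
    return {
        "treatments": ss_treat,
        "blocks": ss_blocks,
        "error": ss_error,
        "total": ss_total,
        "F blocked": f_blocked,
        "F unblocked": f_unblocked,
    }


# Rows are roving sets 1-4; columns are roller weightings A, B, C.
yarn = [
    [1577, 1535, 1592],
    [1690, 1640, 1652],
    [1800, 1783, 1810],
    [1642, 1621, 1663],
]

result = randomized_block_anova(yarn)
for name, value in result.items():
    print(name, round(value, 2))
```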

The second experiment was designed to measure the effect of varia- 
tions in sizing treatments on warp breakage rate. There are four 
treatments A, B, C, and D; there are two factors, loom and time differ- 
ences, whose effects we wish to eliminate. For this problem a Latin 
Square arrangement is especially advantageous for, as already men- 
tioned, neither of the two influential factors could be held constant 
throughout the experiment without augmenting the effect of the other 
on the error. If a single loom is used throughout, many time periods 
(weeks, approximately) will be needed, and variation associated with 
time will increase the error; whereas if the experiment is completed in a 
single week, many looms will be needed and loom differences will mount. 
The Latin Square arrangement effectively eliminates the principal effect 
of both sources of variation. 





                          Loom
Period        1          2          3          4

   1        44 (D)     54 (A)     71 (C)     29 (B)
   2        22 (C)     59 (B)    100 (D)     22 (A)
   3        31 (A)     40 (C)     79 (B)     38 (D)
   4        27 (B)     83 (D)    100 (A)     29 (C)

Source of variation      Sum of squares   Degrees of freedom   Mean square

Among looms                   9,025.0             3                3,008
Among periods                   370.5             3                  124
Among sizes                   1,389.5             3                  463
Residual (error)                254.0             6                   42

Total                        11,039.0            15






We have, for sizes, 

$$F = \frac{463}{42} = 11.0$$

which for three and six degrees of freedom is highly significant. 
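
The table for the sizing experiment can be recomputed from the 4 by 4 Square with the Python sketch below. The function name is ours and the sketch is only an illustration; rows are taken as periods, columns as looms, and letters as sizing treatments.

```python
def latin_square_ss(values, letters):
    """Sums of squares for rows, columns, letters, residual and total."""
    size = len(values)
    grand_total = sum(sum(row) for row in values)
    correction = grand_total ** 2 / (size * size)

    def from_totals(totals):
        return sum(t * t for t in totals) / size - correction

    letter_totals = {}
    for i in range(size):
        for j in range(size):
            letter_totals[letters[i][j]] = (
                letter_totals.get(letters[i][j], 0.0) + values[i][j]
            )

    ss = {
        "rows": from_totals([sum(row) for row in values]),
        "columns": from_totals(
            [sum(values[i][j] for i in range(size)) for j in range(size)]
        ),
        "letters": from_totals(letter_totals.values()),
        "total": sum(x * x for row in values for x in row) - correction,
    }
    ss["residual"] = ss["total"] - ss["rows"] - ss["columns"] - ss["letters"]
    return ss


# Warp breakage: rows are periods 1-4, columns are looms 1-4.
breaks = [
    [44, 54, 71, 29],
    [22, 59, 100, 22],
    [31, 40, 79, 38],
    [27, 83, 100, 29],
]
sizes = ["DACB", "CBDA", "ACBD", "BDAC"]

ss = latin_square_ss(breaks, sizes)
error_ms = ss["residual"] / 6                # (4 - 1)(4 - 2) degrees of freedom
f_sizes = (ss["letters"] / 3) / error_ms
f_periods = (ss["rows"] / 3) / error_ms
f_looms = (ss["columns"] / 3) / error_ms
print({name: round(value, 1) for name, value in ss.items()})
print(round(f_sizes, 1), round(f_periods, 1), round(f_looms, 1))
```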

An F test applied to periods yields P > 0.05, i.e., the absence of a 
significant effect. Thus, while removal of the effect of loom differences 
clearly improved the precision of the experiment, the same is not true 
of time differences. Were such an experiment to be performed again, 
the question arises as to whether a randomized block arrangement should 
be used, for only one factor (looms) has a significant effect. With the 
present data, a randomized block arrangement (looms as in the Latin 
Square but periods allocated at random) would give the information 
shown in the accompanying table. 



Source of variation      Sum of squares   Degrees of freedom   Mean square

Among looms                   9,025.0             3                3,008
Among sizes                   1,389.5             3                  463
Residual (error)                624.5             9                   69.4

Total                        11,039.0            15

$$F = \frac{463}{69.4} = 6.67$$

which yields 0.01 < P < 0.05. 

In this instance there is little to choose between the two arrangements, 
and inasmuch as we do not know in advance which factors are important, 
the Latin Square is preferable, at least for the original experiment. The 
advantage of the randomized block arrangement over the Latin Square 
is that the degrees of freedom wasted on an unimportant influence 
(periods) in the latter are allocated to error in the former. Thus the 
error mean square of the block, while it happens to be larger, is more reliable, 
for it is based on nine rather than six independent variates, and the 
value of the ratio F needed to attribute significance to differences among 
sizings is smaller (3.86 as against 4.76). In the present example, this 
increase in the number of independent observations (from 6 to 9) is 
offset by the increase in the mean square (from 42 to 69.4). The effect 
of periods is not significant but it is much larger than the error of the 
Square, and this eliminates whatever advantage there might otherwise 
have been in a randomized block arrangement. 
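
The trade-off described here can be set out numerically. The Python sketch below uses only figures quoted in the text (the sums of squares and the two 5 per cent values of F); the names and the idea of expressing each F as a multiple of its tabled limit are our own way of making the comparison.

```python
# Figures from the Latin Square table for the sizing experiment.
sizes_ms = 1389.5 / 3
square_error_ss, square_error_df = 254.0, 6
periods_ss, periods_df = 370.5, 3

# In randomized blocks the periods term is returned to error.
block_error_ss = square_error_ss + periods_ss
block_error_df = square_error_df + periods_df

square_error_ms = square_error_ss / square_error_df
block_error_ms = block_error_ss / block_error_df

f_square = sizes_ms / square_error_ms
f_block = sizes_ms / block_error_ms

# 5 per cent values of F quoted in the text for (3, 6) and (3, 9) d.f.
limit_square, limit_block = 4.76, 3.86

print(round(square_error_ms, 1), round(block_error_ms, 1))
print(round(f_square, 2), round(f_square / limit_square, 2))
print(round(f_block, 2), round(f_block / limit_block, 2))
```

The block arrangement needs a smaller F, but its observed F shrinks by more than the limit does, so in this instance the Square clears its limit by the wider margin.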

2.16 Other examples involving analysis of variance. Campbell and 
Lovell (6) give the following data on six independent sets of laboratory 
knock-ratings of a fuel. 



Set 1    Set 2    Set 3    Set 4    Set 5    Set 6

70.5     69.7     70.5     71.4     71.0     69.5
71.9     70.5     70.7     70.5     71.3     70.6
71.0     70.4     71.0     71.2     70.8     71.5
71.5     70.2     70.5     70.8     70.7     70.6
71.1     71.0     70.3     70.1     69.8     70.2
70.1     71.0     71.2     70.8     70.5     70.6
69.8     71.4     70.1     71.4     70.6     71.0
70.5     70.5     71.0     71.0     70.0     70.8
70.0     70.8     70.4     71.0     69.9     71.4
71.1     70.9     70.0     70.6     70.8     70.1
         70.5     71.1     70.6              70.2
         71.2     71.0     70.4
                  71.4
                  71.0
                  71.0
                  71.2
                  70.4










Do the laboratory mean ratings differ significantly among them- 
selves? Or may the six sets of ratings be combined? 

The principal difference between this and preceding examples of 
completely randomized arrangements lies in the fact that the column 
means X c are not equally important, for they are based on different 
numbers of observations. This fact affects somewhat the validity of 
the following analysis, but we shall assume this effect to be slight. 

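The within-sets sum of squares is 

$$\sum (X - \bar{X}_c)^2 = \sum X^2 - \left[\frac{(\sum X_1)^2}{n_1} + \cdots + \frac{(\sum X_6)^2}{n_6}\right] = 16.86$$

and the among-sets term is 

$$\sum (\bar{X}_c - \bar{X})^2 = \left[\cdots + \frac{(776.5)^2}{11}\right] - 72\left(\frac{5090.1}{72}\right)^2 = 0.63$$
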

The analysis of variance follows: 



Source of variation      Sum of squares   Degrees of freedom   Mean square

Among sets                     0.63               5               0.126
Within sets (error)           16.86              66               0.255

Total                         17.49              71













Actually, the mean square associated with variability among sets is less 
(although not significantly less) than the mean square error. No 
F test will be applied. The differences among the means of the six sets 
are not significant; the sets are homogeneous and they can be combined. 
The mean will be 70.7 and the standard error of the mean $\sqrt{(17.49/71)/72}$ 
= 0.059. 
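
The analysis for sets of unequal size follows the same computing forms, each set total being squared and divided by its own number of observations. The Python sketch below is an illustration only; the function name is ours, and the assignment of the last few readings to sets 2, 3, 4, and 6 is our reconstruction from the layout of the printed table, checked against the set 6 total of 776.5 and the grand total of 5090.1 that appear in the text.

```python
import math

knock = {
    1: [70.5, 71.9, 71.0, 71.5, 71.1, 70.1, 69.8, 70.5, 70.0, 71.1],
    2: [69.7, 70.5, 70.4, 70.2, 71.0, 71.0, 71.4, 70.5, 70.8, 70.9,
        70.5, 71.2],
    3: [70.5, 70.7, 71.0, 70.5, 70.3, 71.2, 70.1, 71.0, 70.4, 70.0,
        71.1, 71.0, 71.4, 71.0, 71.0, 71.2, 70.4],
    4: [71.4, 70.5, 71.2, 70.8, 70.1, 70.8, 71.4, 71.0, 71.0, 70.6,
        70.6, 70.4],
    5: [71.0, 71.3, 70.8, 70.7, 69.8, 70.5, 70.6, 70.0, 69.9, 70.8],
    6: [69.5, 70.6, 71.5, 70.6, 70.2, 70.6, 71.0, 70.8, 71.4, 70.1,
        70.2],
}


def unequal_anova(sets):
    """One-way sums of squares when the sets differ in size."""
    n = sum(len(values) for values in sets)
    grand_total = sum(sum(values) for values in sets)
    correction = grand_total ** 2 / n
    sum_x2 = sum(x * x for values in sets for x in values)
    among_raw = sum(sum(values) ** 2 / len(values) for values in sets)
    return {
        "n": n,
        "mean": grand_total / n,
        "among": among_raw - correction,
        "within": sum_x2 - among_raw,
        "total": sum_x2 - correction,
    }


out = unequal_anova(list(knock.values()))
k = len(knock)
ms_among = out["among"] / (k - 1)
ms_within = out["within"] / (out["n"] - k)
# Standard error of the combined mean, from the total mean square.
se_mean = math.sqrt(out["total"] / (out["n"] - 1) / out["n"])

print(out["n"], round(out["mean"], 1))
print(round(out["among"], 2), round(out["within"], 2), round(out["total"], 2))
print(round(ms_among, 3), round(ms_within, 3), round(se_mean, 3))
```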

If the data satisfy the assumptions underlying the method of analysis 
of variance, variation attributable to the action of specific factors and/or 
to their interaction can always be isolated. We shall give several examples. 

Assume that two factors (and their interaction, see 2.17) are suspected 
of being responsible for variation. These three factors are isolated and 
the variation due to each is compared with the variation due only to 
experimental error; we thereby determine whether or not our suspicions 
are justified. 

The data must provide a satisfactory estimate of experimental error 
and in the examples to be given this is true; for each value of the sus- 
pected causes (in the first example these causes are differences among 
lots and differences among rolls) there are several (three) values of the 
variable being studied (porosity). Within each set of three readings on 
porosity there is no change of lot or roll; whatever differences there are 
within each set of three readings are attributed to a large number of 
independent and unknown causes, each of which has a small effect. In 
short, these differences constitute experimental error. 

Rider (34, a) gives the following Western Electric Company data on 
the porosity of condenser paper. Three readings are made on each of 
nine rolls from each lot. 

Lot      Reading                        Roll number
number   number     1     2     3     4     5     6     7     8     9

  I         1      1.5   1.5   2.7   3.0   3.4   2.1   2.0   3.0   5.1
            2      1.7   1.6   1.9   2.4   5.6   4.1   2.5   2.0   5.0
            3      1.6   1.7   2.0   2.6   5.6   4.6   2.8   1.9   4.0

  II        1      1.9   2.3   1.8   1.9   2.0   3.0   2.4   1.7   2.6
            2      1.5   2.4   2.9   3.5   1.9   2.6   2.0   1.5   4.3
            3      2.1   2.4   4.7   2.8   2.1   3.5   2.1   2.0   2.4

  III       1      2.5   3.2   1.4   7.8   3.2   1.9   2.0   1.1   2.1
            2      2.9   5.5   1.5   5.2   2.5   2.2   2.4   1.4   2.5
            3      3.3   7.1   3.4   5.0   4.0   3.1   3.7   4.1   1.9



The method of analysis of variance does not automatically suggest the 
appropriate breakdown of the data. Thus we might study the variance 
resulting from the differences between the means of rolls $\bar{X}_r$ (including 
all lots) and the grand mean $\bar{X}$, the corresponding sum of squares being 

$$\sum (\bar{X}_r - \bar{X})^2$$

This would have slight value, for the position, say number 1, of a roll 
has no real meaning from lot to lot. The appropriate breakdown of the 
total sum of squares is 

Total = Among lots + Among rolls within lots + Among measurements within lot-rolls

$$\sum (X - \bar{X})^2 = \sum (\bar{X}_l - \bar{X})^2 + \sum (\bar{X}_{rl} - \bar{X}_l)^2 + \sum (X - \bar{X}_{rl})^2$$

where $\bar{X}_l$ is the mean of a lot including all rolls, $\bar{X}_{rl}$ is the mean of a roll 
for a given lot, etc. Notice that, if the summation signs and squares are 
removed, the symbols "cancel out." Analysis of variance yields the 
following table: 



Source of variation               Sum of squares   Degrees of freedom   Mean square

Among lots                              7.90               2               3.95
Among rolls within lots                92.87              24               3.87
Among measurements
  within lot-rolls (error)             42.29              54               0.78

Total                                 143.06              80





As before, we may use as a practical rule the fact that each sum of
squares has degrees of freedom equal to the number of variates summed
less the number of independent relations between the variates. In the
present example there are eighty-one observations, so the variance
estimate involving $\sum (X - \bar{X})^2$ will be based on eighty degrees
of freedom, the mean $\bar{X}$ having been calculated from the
observations. The among-lots sum of squares $\sum (\bar{X}_l - \bar{X})^2$
involves three lot means less one relationship between them (again the
grand mean); hence there are two degrees of freedom for the estimate
based on this sum of squares. For among-rolls-within-lots, with sum of
squares $\sum (\bar{X}_{rl} - \bar{X}_l)^2$, there are twenty-seven values
of $\bar{X}_{rl}$ less three values of $\bar{X}_l$, or 24 degrees of
freedom. In the among-measurements-within-lot-rolls sum of squares
$\sum (X - \bar{X}_{rl})^2$, 81 values of $X$ are summed but there are
27 relations among the $X$ (27 roll-lot means $\bar{X}_{rl}$), leaving
54 degrees of freedom.
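
The counting rule just described can be checked mechanically. The
following Python sketch is an illustration only; the variable names are
hypothetical, the counts (3 lots, 9 rolls per lot, 3 measurements per
roll) are those implied by the 81 observations and 27 lot-rolls of the
example, and the sums of squares are simply copied from the table above.

```python
# Nested layout implied by the text: 81 observations, 27 lot-rolls, 3 lots.
lots, rolls_per_lot, per_roll = 3, 9, 3

n_obs = lots * rolls_per_lot * per_roll      # variates summed in the total
n_lot_rolls = lots * rolls_per_lot           # number of roll-lot means

# Degrees of freedom = variates summed less independent relations.
df = {
    "total": n_obs - 1,                      # one relation: the grand mean
    "among lots": lots - 1,                  # lot means less the grand mean
    "rolls within lots": n_lot_rolls - lots, # roll means less lot means
    "within lot-rolls": n_obs - n_lot_rolls, # observations less roll means
}

# Sums of squares as printed in the analysis-of-variance table.
ss = {"among lots": 7.90, "rolls within lots": 92.87, "within lot-rolls": 42.29}

# Each mean square is its sum of squares divided by its degrees of freedom.
mean_square = {name: ss[name] / df[name] for name in ss}

for name in ss:
    print(f"{name:20s} df={df[name]:3d}  mean square={mean_square[name]:.2f}")
```

The component degrees of freedom add to the total (2 + 24 + 54 = 80),
just as the component sums of squares add to the total sum of squares.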

Another breakdown of the data would involve a split of the "among
rolls within lots" sum of squares

$\sum (\bar{X}_{rl} - \bar{X}_l)^2$

into two terms, first, among rolls,

$\sum (\bar{X}_r - \bar{X})^2$

and a term which, as will be demonstrated, shows the joint or
interaction effect of lots and rolls on porosity

$\sum (\bar{X}_{rl} - \bar{X}_r - \bar{X}_l + \bar{X})^2$

As in the previous breakdown, there will be 2 degrees of freedom for lots
and 54 degrees of freedom for error. But the 24 degrees of freedom
previously allocated to "among rolls within lots" must now be
allocated to (a) among rolls and (b) interaction of lots and rolls. There
are nine rolls and one restriction in the form of the grand mean, so the
among-rolls estimate of the population variance is based on 8 degrees of
freedom. By subtraction, 16 degrees of freedom are allocable to the
interaction estimate of $\sigma^2$. Again notice the cancellation of
symbols when the exponents and summation signs are removed.

The degrees of freedom allocated to the interaction (of lots and rolls) 
estimate will be shown at the end of this chapter to be the product of the 
degrees of freedom allocable to the constituent factors (among lots 
with 2 and among rolls with 8 degrees of freedom). 
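
A short Python sketch, offered only as an illustration, shows that the
subtraction rule and the product rule give the same allocation for this
example. The names are hypothetical; the two sums of squares are those
printed in the table for this second breakdown.

```python
lots, rolls = 3, 9

df_lots = lots - 1                    # 2
df_rolls = rolls - 1                  # 8: nine roll positions less the grand mean
df_within = lots * rolls - lots       # 24: "among rolls within lots"

# Interaction degrees of freedom by the two routes described in the text.
df_interaction_by_subtraction = df_within - df_rolls
df_interaction_by_product = df_lots * df_rolls

# The 92.87 allotted to "among rolls within lots" is split the same way.
ss_rolls, ss_interaction = 26.32, 66.55
ss_within = ss_rolls + ss_interaction

print(df_interaction_by_subtraction, df_interaction_by_product, round(ss_within, 2))
```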

Analysis of variance yields the following table. 



Source of variation            Sum of squares   Degrees of   Mean square
                                                 freedom

Among lots                           7.90            2           3.95
Among rolls                         26.32            8           3.29
Interaction of
  rolls and lots                    66.55           16           4.16
Error                               42.29           54           0.78

Total                              143.06           80





But differences among rolls for all lots have no practical significance
whereas roll differences within lots are meaningful; hence the earlier
breakdown of variability in quality is to be preferred.

The F test, as used in the analysis of variance, is essentially the ratio
of variability associated with a suspected cause to error. For rolls
(in the earlier breakdown) the appropriate ratio is

$F_{\text{rolls}} = \frac{3.87}{0.78} = 4.96$

From Table VIII, for 24 and 54 degrees of freedom, the 1 per cent level
value of $F_{\text{rolls}}$ is 2.16. The actual value of F exceeds this
critical value. Variation in porosity is therefore partly attributable to
differences among rolls within lots and, if possible, these differences
should be eliminated.

In judging the variability among lots, the appropriate error sum of
squares is 92.87 plus 42.29, with 24 plus 54 degrees of freedom, for both
factors associated with these quantities clearly contribute to the error
of comparing lot means. We have $F = \frac{3.95}{1.73} = 2.28$ which,
from Table VIII, for 2 and 78 degrees of freedom, is not significant.
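
The two ratios can be retraced in a few lines of Python. This is a
sketch for checking the arithmetic only; the function name `f_ratio` is
hypothetical, and the critical value 2.16 is the one quoted above from
Table VIII rather than something the code computes.

```python
def f_ratio(ms_effect, ms_error):
    """Ratio of the mean square for a suspected cause to the error mean square."""
    return ms_effect / ms_error

# Rolls within lots are tested against the within lot-roll error.
f_rolls = f_ratio(3.87, 0.78)

# Lots are tested against an error pooled from rolls-within-lots and
# measurements-within-lot-rolls: sums of squares and degrees of freedom add.
pooled_ss = 92.87 + 42.29
pooled_df = 24 + 54
pooled_ms = pooled_ss / pooled_df
f_lots = f_ratio(3.95, pooled_ms)

print(f"F rolls = {f_rolls:.2f}, pooled error = {pooled_ms:.2f}, F lots = {f_lots:.2f}")
print("rolls significant at 1 per cent:", f_rolls > 2.16)
```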

Rider (34, a) gives the following Western Electric Company data on 
impact strength, in foot-pounds, of specimens of insulating material. 
The specimens were cut lengthwise and crosswise from the sheets as 
indicated. 





              Specimen                   Lot number
               number       I       II      III      IV       V

Lengthwise        1        1.15    1.16    0.79    0.96    0.49
specimens         2        0.84    0.85    0.68    0.82    0.61
                  3        0.88    1.00    0.64    0.98    0.59
                  4        0.91    1.08    0.72    0.93    0.51
                  5        0.86    0.80    0.63    0.81    0.53
                  6        0.88    1.01    0.59    0.79    0.72
                  7        0.92    1.14    0.81    0.79    0.67
                  8        0.87    0.87    0.65    0.86    0.47
                  9        0.93    0.97    0.64    0.84    0.44
                 10        0.95    1.09    0.75    0.92    0.48

Crosswise         1        0.89    0.86    0.52    0.86    0.52
specimens         2        0.69    1.17    0.52    1.06    0.53
                  3        0.46    1.18    0.80    0.81    0.47
                  4        0.85    1.32    0.64    0.97    0.47
                  5        0.73    1.03    0.63    0.90    0.57
                  6        0.67    0.84    0.58    0.93    0.54
                  7        0.78    0.89    0.65    0.87    0.56
                  8        0.77    0.84    0.60    0.88    0.55
                  9        0.80    1.03    0.71    0.89    0.45
                 10        0.79    1.06    0.59    0.82    0.60

The appropriate breakdown, written symbolically, follows:

Between types of cut   $(\bar{X}_c - \bar{X})$

Among lots   $(\bar{X}_l - \bar{X})$

Among specimens within a lot and within a type of cut, that is,
experimental error   $(X - \bar{X}_{lc})$

The total variability is of the form $(X - \bar{X})$; we have

$(X - \bar{X}) = (\bar{X}_c - \bar{X}) + (\bar{X}_l - \bar{X}) + (X - \bar{X}_{lc}) + R$

from which R is of the form

$(\bar{X}_{lc} - \bar{X}_c - \bar{X}_l + \bar{X})$

a term which will presently be shown to symbolize the interaction or
joint effect on impact strength of cuts and lots.



Source of variation        Sum of squares   Degrees of freedom   Mean square

Between types of cut           0.0454               1               0.0454
Among lots                     2.7912               4               0.6978
Interaction of
  cuts and lots                0.1417               4               0.0354
Error                          0.8947              90               0.0099

Total                          3.8730              99













The cut-lot interaction sum of squares is best calculated by writing the 
following totals: 





                               Lot number
Cut              I       II      III      IV       V      Total

Lengthwise      9.19    9.97    6.90    8.70    5.51     40.27
Crosswise       7.43   10.22    6.24    8.99    5.26     38.14

Total          16.62   20.19   13.14   17.69   10.77     78.41



We have for this table

$\sum (\bar{X}_{lc} - \bar{X})^2 = 2.9783$

$\sum (\bar{X}_c - \bar{X})^2 = 0.0454$

$\sum (\bar{X}_l - \bar{X})^2 = 2.7912$ (already calculated)

Interaction sum of squares = 2.9783 - 0.0454 - 2.7912 = 0.1417
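
The subtraction can be reproduced directly from the table of totals.
The Python sketch below is an illustration under stated assumptions: the
helper `ss_from_totals` is a hypothetical name, each cell total is taken
to be the sum of 10 specimens, and the usual computing form (squared
totals divided by the number of observations in each total, less the
correction term) is used in place of deviations from means.

```python
# Cut-by-lot totals from the table; each is the sum of 10 specimens.
totals = {
    "lengthwise": [9.19, 9.97, 6.90, 8.70, 5.51],
    "crosswise": [7.43, 10.22, 6.24, 8.99, 5.26],
}
per_cell = 10

cells = [t for row in totals.values() for t in row]
grand_total = sum(cells)
n_total = per_cell * len(cells)          # 100 specimens in all


def ss_from_totals(group_totals, n_per_group):
    """Sum of squares among groups, computed from the group totals."""
    correction = grand_total ** 2 / n_total
    return sum(t * t for t in group_totals) / n_per_group - correction


cut_totals = [sum(row) for row in totals.values()]        # 50 specimens each
lot_totals = [sum(col) for col in zip(*totals.values())]  # 20 specimens each

ss_cells = ss_from_totals(cells, per_cell)
ss_cuts = ss_from_totals(cut_totals, per_cell * 5)
ss_lots = ss_from_totals(lot_totals, per_cell * 2)
ss_interaction = ss_cells - ss_cuts - ss_lots

print(f"{ss_cells:.4f} - {ss_cuts:.4f} - {ss_lots:.4f} = {ss_interaction:.4f}")
```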

Notice that the breakdown favored in this example is equivalent to
the one not favored in the previous example. In the previous example,
the column headings were roll numbers each of which was without
meaning when taken over all lots. The number three roll, for instance,
was the third roll chosen in each lot and if the selection is made at
random, there is no reason to expect that the third roll should differ
significantly from the others. In the present example, a column comprises
one lot and lot differences are meaningful. Hence in the present
example we are interested in

$\sum (\bar{X}_{\text{column}} - \bar{X})^2$

Finally

$F_{\text{cut}} = \frac{0.0454}{0.0099} = 4.59$;  $F_{\text{critical}\,(.05)} = 3.95$

$F_{\text{lot}} = \frac{0.6978}{0.0099} = 70.49$;  $F_{\text{critical}\,(.05)} = 2.47$

$F_{\text{interaction}} = \frac{0.0354}{0.0099} = 3.58$;  $F_{\text{critical}\,(.05)} = 2.47$

Both cuts and lots, and their joint effect, are significantly responsible
for variable quality.

As a final example, the following Western Electric Company data are
given by Rider (34, a). They deal with the thickness of coating, in
0.0001 of an inch, on fibre strips sprayed with varnish. Measurements
were taken at each of five different points on each of the three strips
selected from each of five lots.

          Strip                Point number
 Lot      number       1      2      3      4      5

  I          1        10      8     10      9      7
             2         8      8      8      8     10
             3         8     10     10      6      7

  II         1        13     12     12     12     13
             2        10      9     13     11      8
             3        11      8     10     12     12

  III        1        12     13     14     17     16
             2        17     10     13     10     14
             3        12     11     13     16     13

  IV         1        14     13     17     11     11
             2        11      9     13     11     12
             3        17     13     14     13      8

  V          1         9     13     17     13     11
             2         8     11     10     12     11
             3         7     14     14      9      9



When analyzing data which come from an experiment designed and
carried out by someone other than the analyst, one must know what
variation in the data is random variation. The term we call "error"
would consist of variability in thickness among points for a given strip
within a given lot if (a) the points were distributed at random and (b)
the three strips were, say, consistently of three different kinds. If,
however, the strips are randomly chosen from the lot while the different
points refer systematically to certain parts of a strip, i.e., they are not
randomly selected, the error term would better consist of the estimate
of variance from the term among-strips. In the present example the
former is true. The appropriate analysis is, therefore,

(a) variability among lots   $(\bar{X}_l - \bar{X})$

(b) variability among strips within lots   $(\bar{X}_{sl} - \bar{X}_l)$

(c) variability among points within strips   $(X - \bar{X}_{sl})$

Symbolically

$(X - \bar{X}) = (\bar{X}_l - \bar{X}) + (\bar{X}_{sl} - \bar{X}_l) + (X - \bar{X}_{sl}) + R$

from which

$R = 0$



The $(X - \bar{X}_{sl})$ term is the experimental error term.

Source of variation            Sum of squares   Degrees of   Mean square
                                                 freedom

Among lots                         207.92            4          51.98
Among strips within lots            49.20           10           4.92
Among points within strips
  (error)                          277.20           60           4.62

Total                              534.32           74













We have

$F_{\text{lots}} = \frac{51.98}{4.66} = 11.15$

and

$F_{\text{strips}} = \frac{4.92}{4.62} = 1.06$

the value 4.66 being the mean square compounded from among strips
within lots and among points within strips (as in the example on page
72). The 5 per cent level values are 2.50 and 1.99. Only the lot
variation is statistically significant. The association of strips with
variation in coating thickness is not statistically significant.
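
The whole of this nested analysis can be retraced from the original
readings. The Python sketch below is offered as an illustration, not as
the method by which the table was computed; `nested_anova` and the other
names are hypothetical, and the computing form based on totals is assumed.

```python
# thickness[lot][strip] holds the five point readings (units of 0.0001 inch).
thickness = [
    [[10, 8, 10, 9, 7], [8, 8, 8, 8, 10], [8, 10, 10, 6, 7]],            # lot I
    [[13, 12, 12, 12, 13], [10, 9, 13, 11, 8], [11, 8, 10, 12, 12]],     # lot II
    [[12, 13, 14, 17, 16], [17, 10, 13, 10, 14], [12, 11, 13, 16, 13]],  # lot III
    [[14, 13, 17, 11, 11], [11, 9, 13, 11, 12], [17, 13, 14, 13, 8]],    # lot IV
    [[9, 13, 17, 13, 11], [8, 11, 10, 12, 11], [7, 14, 14, 9, 9]],       # lot V
]


def nested_anova(data):
    """Return {source: (sum of squares, degrees of freedom)} for lots > strips > points."""
    readings = [x for lot in data for strip in lot for x in strip]
    n = len(readings)
    n_lots = len(data)
    n_strips = sum(len(lot) for lot in data)
    correction = sum(readings) ** 2 / n
    # Squared totals divided by the number of readings in each total.
    lot_term = sum(sum(map(sum, lot)) ** 2 / sum(map(len, lot)) for lot in data)
    strip_term = sum(sum(s) ** 2 / len(s) for lot in data for s in lot)
    raw_term = sum(x * x for x in readings)
    return {
        "lots": (lot_term - correction, n_lots - 1),
        "strips within lots": (strip_term - lot_term, n_strips - n_lots),
        "points within strips": (raw_term - strip_term, n - n_strips),
    }


table = nested_anova(thickness)
(ss_l, df_l), (ss_s, df_s), (ss_e, df_e) = table.values()

# Lots are judged against the error compounded from strips and points.
pooled_ms = (ss_s + ss_e) / (df_s + df_e)
f_lots = (ss_l / df_l) / pooled_ms
f_strips = (ss_s / df_s) / (ss_e / df_e)

for source, (ss, df) in table.items():
    print(f"{source:22s} {ss:8.2f} {df:3d} {ss / df:7.2f}")
print(f"F lots = {f_lots:.2f}, F strips = {f_strips:.2f}")
```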

2.17 Interaction. To illustrate the meaning of interaction, con- 
sider the two following examples (from Snedecor, 38). 





                     Column
 Row          1       2       3       Mean

  1          1.8     2.0     1.4      1.73
  2          1.6     1.8     1.2      1.53
  3          1.3     1.5     0.9      1.23

 Mean        1.57    1.77    1.17     1.50



Analysis of variance yields the following table: 



Source of variation        Sum of squares   Degrees of   Mean square
                                             freedom

Rows                            0.38             2           0.19
Columns                         0.56             2           0.28
Interaction of
  rows and columns              0                4           0

Total                           0.94             8

where

Total     $\sum (X - \bar{X})^2$

Rows      $\sum (\bar{X}_r - \bar{X})^2$

Columns   $\sum (\bar{X}_c - \bar{X})^2$

The interaction sum of squares

$\sum (X - \bar{X}_c - \bar{X}_r + \bar{X})^2$

is given by

$(1.8 - 1.73 - 1.57 + 1.50)^2 + (1.6 - 1.53 - 1.57 + 1.50)^2 + \cdots$

each term of which is zero. To appreciate the meaning of zero
interaction, notice that in proceeding from the first to the second column
of the original data, all variates are increased by the same (absolute, not
percentage) amount (0.2) and from the second to the third column all
variates are decreased by the same amount (0.6), and similarly for
rows. Variation among observations from column to column is the same
regardless of which row is considered, i.e., there is no "interaction"
between columns and rows.
As a second example, consider 





                     Column
 Row          1       2       3       Mean

  1          1.6     2.0     0.8      1.47
  2          1.5     1.0     1.9      1.47
  3          1.3     1.4     1.7      1.47

 Mean        1.47    1.47    1.47     1.47



Analysis of variance yields the following table : 



Source of variation        Sum of squares   Degrees of freedom   Mean square

Rows                            0                   2                0
Columns                         0                   2                0
Interaction of
  rows and columns              1.24                4                0.31

Total                           1.24                8

In this case, all variation is attributable to interaction. As one
proceeds from column to column, the amount and direction of change of
the variate is completely dependent on the row; thus from the first to
the second column the variate increases by 0.4 in the first row, decreases
by 0.5 in the second, and increases by 0.1 in the third. The algebraic
sum of these changes is zero. Separately, the suspected "causes"
(rows and columns) are responsible for none of the variability; operating
jointly they are responsible for all of the variability.
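
Both extremes can be verified with one small routine. The following
Python sketch is an illustration only; `two_way_breakdown` is a
hypothetical name, and it assumes a single observation in each
row-column cell, as in the two tables above.

```python
def two_way_breakdown(table):
    """Split the total sum of squares of a rows x columns table
    (one observation per cell) into rows, columns and interaction."""
    r, c = len(table), len(table[0])
    grand = sum(map(sum, table)) / (r * c)
    row_means = [sum(row) / c for row in table]
    col_means = [sum(col) / r for col in zip(*table)]
    return {
        "rows": c * sum((m - grand) ** 2 for m in row_means),
        "columns": r * sum((m - grand) ** 2 for m in col_means),
        # Residual left in each cell after removing row and column effects.
        "interaction": sum(
            (table[i][j] - row_means[i] - col_means[j] + grand) ** 2
            for i in range(r) for j in range(c)
        ),
        "total": sum((x - grand) ** 2 for row in table for x in row),
    }


no_interaction = [[1.8, 2.0, 1.4], [1.6, 1.8, 1.2], [1.3, 1.5, 0.9]]
all_interaction = [[1.6, 2.0, 0.8], [1.5, 1.0, 1.9], [1.3, 1.4, 1.7]]

first = two_way_breakdown(no_interaction)
second = two_way_breakdown(all_interaction)

for name, result in (("first example", first), ("second example", second)):
    print(name, {k: round(v, 2) for k, v in result.items()})
```

In the first table the cell residuals vanish because every column
differs from its neighbor by a constant; in the second the row and
column means are all equal, so the residuals carry the whole of the
variation.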

Practical problems yield interactions somewhere between the extremes 
shown in these two examples. Finally, when more than two " causes " 
are under investigation (as in the following example) more complex inter- 
action terms will be produced; these may be interpreted analogously to 
the above. 

2.18 Formal analysis of variance. Data can be classified with
respect to any number of factors (causes). In the few published
examples dealing with four or more factors, certain effects are a priori
considered unimportant and the analyst therefore uses a plan of
analysis appropriate to his data but one which is rarely useful elsewhere.
Experience indicates that students have difficulty following such
non-systematic procedures.

                          Pot 1                    Pot 2
                         Cylinder                 Cylinder
 Run    Journey       3     10     16          3     10     16

  I        1         47     56    100         52     61     88
           2         55     89     93         49     62     97
           3         35     57     56         34     60     72
           4         78     67    113         47     93    118
           5         33     40    128         16     29    130

  II       1         52     66     36         65     80     40
           2         21     61     49        122     97     79
           3         31     39     25         45     54     72
           4         43     72     52        109    120     80
           5         37     51     67         67     85     63

  III      1         50     61     60         75    139    130
           2         33     27     49         46     58     63
           3         24     39     24         15     33     39
           4         18     18     43         22     16     19
           5         28     42     28         27     19     22

  IV       1         24     34     43         46     66     24
           2         24     49     42         40    117    105
           3         21     21     51         30     28     34
           4         21     69     48         36     64     53
           5         76     48     42         39     60     78

Examples involving multifold classification can always first be an- 
alyzed formally (systematically) and combining of terms can be left to 
the end. The following is an example of this procedure. 

Tippett (42, a) gives the data in the previous table, from a paper 
by Gould and Hampton (21) on the mean number of seed (defects) 
per unit area of spectacle glass. Four factors (runs, journeys, cylin- 
ders, and pots) may affect the seed count; accordingly the experiment is 
conducted and the data are classified with respect to these factors. 
Cylinders of glass are made in pots, a journey is equivalent to a day, 
and glass made on consecutive days from the same pots constitutes a run. 
Three cylinders of the eighteen made (numbers 3, 10, and 16 in the 
order of manufacture) were studied. 

These data can be broken down in many ways. Tippett uses the 
following breakdown, which yields mean squares having the greatest 
practical interest. 

SOURCE OF VARIATION DEGREES OF FREEDOM 

(a) Within pots 

(1) Between cylinders 16 

(2) Between journeys 32 

(3) Residual within pots 64 

(4) Total 112 

(b) Between pots 

(5) Between runs 3 

(6) Residual between pots 4 

(7) Total 7 

(c) Between cylinders 

(8) Common to all runs 2 

(9) Common to both pots in run less (8) 6 

(10) Specific to pot 8 

(11) Total 16 

(d) Between journeys 

(12) Common to all runs 4 

(13) Common to both pots in run less (12) 12 

(14) Specific to pot 16 

(15) Total 32 

Grand total 119 

The original data provide no completely satisfactory index of
experimental error for there is but one measurement for each combination
of suspected causes; it is the practice, particularly in unreplicated
agricultural experimentation, to take a complex (uninterpretable)
interaction term (interaction of pots, runs, journeys, and cylinders) as
experimental error.

If we have two suspected causes (rows and columns), there are

$C^2_1 + C^2_2 = 3$

terms in a formal breakdown, $C^2_1$ representing the number of
combinations of two things taken one at a time, etc.

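c columns, r rows

Source of variation      Sum of squares                                      Degrees of freedom

Among columns            $\sum (\bar{X}_c - \bar{X})^2$                      $c - 1$
Among rows               $\sum (\bar{X}_r - \bar{X})^2$                      $r - 1$
Interaction
  (columns X rows)       $\sum (X - \bar{X}_c - \bar{X}_r + \bar{X})^2$      $(c - 1)(r - 1)$

Total                    $\sum (X - \bar{X})^2$                              $rc - 1$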



For a three-cause formal breakdown, there are

$C^3_1 + C^3_2 + C^3_3 = 7$

classifications.

c columns, g groups, r rows

Source of variation          Sum of squares                                                     Degrees of freedom

Among columns                $\sum (\bar{X}_c - \bar{X})^2$                                     $c - 1$
Among rows                   $\sum (\bar{X}_r - \bar{X})^2$                                     $r - 1$
Among groups                 $\sum (\bar{X}_g - \bar{X})^2$                                     $g - 1$
First-order interaction
  (columns X rows)           $\sum (\bar{X}_{cr} - \bar{X}_c - \bar{X}_r + \bar{X})^2$          $(c - 1)(r - 1)$
First-order interaction
  (columns X groups)         $\sum (\bar{X}_{cg} - \bar{X}_c - \bar{X}_g + \bar{X})^2$          $(c - 1)(g - 1)$
First-order interaction
  (rows X groups)            $\sum (\bar{X}_{rg} - \bar{X}_r - \bar{X}_g + \bar{X})^2$          $(r - 1)(g - 1)$
Second-order interaction
  (columns X rows            $\sum (X - \bar{X}_{cr} - \bar{X}_{cg} - \bar{X}_{rg}$
  X groups)                  $\quad + \bar{X}_c + \bar{X}_r + \bar{X}_g - \bar{X})^2$           $(c - 1)(r - 1)(g - 1)$

Total                        $\sum (X - \bar{X})^2$                                             $crg - 1$



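For a four-way formal breakdown, we have

$C^4_1 + C^4_2 + C^4_3 + C^4_4 = 15$

classifications. The numbers in parentheses show the degrees of freedom
associated with the mean squares for the data of Gould and Hampton.

MAIN EFFECTS

Runs        $\sum (\bar{X}_r - \bar{X})^2$    $(r - 1)$    (3)
Pots        $\sum (\bar{X}_p - \bar{X})^2$    $(p - 1)$    (1)
Journeys    $\sum (\bar{X}_j - \bar{X})^2$    $(j - 1)$    (4)
Cylinders   $\sum (\bar{X}_c - \bar{X})^2$    $(c - 1)$    (2)

FIRST-ORDER INTERACTION

Runs X pots            $\sum (\bar{X}_{rp} - \bar{X}_r - \bar{X}_p + \bar{X})^2$    $(r - 1)(p - 1)$    (3)
Runs X journeys        $\sum (\bar{X}_{rj} - \bar{X}_r - \bar{X}_j + \bar{X})^2$    $(r - 1)(j - 1)$    (12)
Runs X cylinders       $\sum (\bar{X}_{rc} - \bar{X}_r - \bar{X}_c + \bar{X})^2$    $(r - 1)(c - 1)$    (6)
Pots X journeys        $\sum (\bar{X}_{pj} - \bar{X}_p - \bar{X}_j + \bar{X})^2$    $(p - 1)(j - 1)$    (4)
Pots X cylinders       $\sum (\bar{X}_{pc} - \bar{X}_p - \bar{X}_c + \bar{X})^2$    $(p - 1)(c - 1)$    (2)
Journeys X cylinders   $\sum (\bar{X}_{jc} - \bar{X}_j - \bar{X}_c + \bar{X})^2$    $(j - 1)(c - 1)$    (8)

SECOND-ORDER INTERACTION

Runs X journeys X cylinders
$\sum (\bar{X}_{rjc} - \bar{X}_{rj} - \bar{X}_{rc} - \bar{X}_{jc} + \bar{X}_r + \bar{X}_j + \bar{X}_c - \bar{X})^2$    $(r - 1)(j - 1)(c - 1)$    (24)

Runs X journeys X pots
$\sum (\bar{X}_{rjp} - \bar{X}_{rj} - \bar{X}_{rp} - \bar{X}_{jp} + \bar{X}_r + \bar{X}_j + \bar{X}_p - \bar{X})^2$    $(r - 1)(j - 1)(p - 1)$    (12)

Journeys X cylinders X pots
$\sum (\bar{X}_{jcp} - \bar{X}_{jc} - \bar{X}_{jp} - \bar{X}_{cp} + \bar{X}_j + \bar{X}_c + \bar{X}_p - \bar{X})^2$    $(j - 1)(c - 1)(p - 1)$    (8)

Runs X cylinders X pots
$\sum (\bar{X}_{rcp} - \bar{X}_{rc} - \bar{X}_{rp} - \bar{X}_{cp} + \bar{X}_r + \bar{X}_c + \bar{X}_p - \bar{X})^2$    $(r - 1)(c - 1)(p - 1)$    (6)

THIRD-ORDER INTERACTION

Runs X journeys X cylinders X pots
$\sum (X - \bar{X}_{rjc} - \bar{X}_{rjp} - \bar{X}_{rcp} - \bar{X}_{jcp} + \bar{X}_{rj} + \bar{X}_{rc} + \bar{X}_{rp} + \bar{X}_{jc} + \bar{X}_{jp} + \bar{X}_{cp}$
$\quad - \bar{X}_r - \bar{X}_j - \bar{X}_c - \bar{X}_p + \bar{X})^2$    $(r - 1)(j - 1)(c - 1)(p - 1)$    (24)

Total    $\sum (X - \bar{X})^2$    $rjcp - 1$    (119)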
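
The count of terms and the allocation of degrees of freedom follow a
fixed pattern that is easily enumerated. The Python sketch below is an
illustration only; `formal_terms` is a hypothetical name, and the numbers
of levels (4 runs, 2 pots, 5 journeys, 3 cylinders) are those of the
Gould and Hampton data.

```python
from itertools import combinations
from math import comb, prod

levels = {"runs": 4, "pots": 2, "journeys": 5, "cylinders": 3}


def formal_terms(levels):
    """List every main effect and interaction with its degrees of freedom."""
    names = list(levels)
    terms = []
    for order in range(1, len(names) + 1):
        for combo in combinations(names, order):
            # Degrees of freedom multiply: (levels - 1) for each factor involved.
            df = prod(levels[name] - 1 for name in combo)
            terms.append((" x ".join(combo), df))
    return terms


terms = formal_terms(levels)
n_terms = sum(comb(len(levels), k) for k in range(1, len(levels) + 1))

for name, df in terms:
    print(f"{name:40s} {df:3d}")
print("terms:", n_terms, " total degrees of freedom:", sum(df for _, df in terms))
```

The degrees of freedom of the fifteen terms add to 119, one less than
the 120 observations.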



It is easier to compute and to appreciate the meanings of these terms if
one sets down portions of the data in the three-cause form. Instead of
writing the means in each case, the totals are used, for although we are
always dealing with means in the form $\sum X / n$ in variance analysis,
the means themselves need seldom be computed; $\sum X$ is sufficient and
more convenient.



TABLE a

                           Run
 Journey        1       2       3       4

    1          404     339     515     237
    2          445     429     276     377
    3          314     266     174     185
    4          516     476     136     291
    5          376     370     166     343

TABLE b

                           Run
 Cylinder       1       2       3       4

    3          446     592     338     357
   10          614     725     452     556
   16          995     563     477     520

TABLE c

                           Run
 Pot            1       2       3       4

  1           1047     702     544     613
  2           1008    1178     723     820

TABLE d

                             Journey
 Cylinder       1       2       3       4       5

    3          411     390     235     374     323
   10          563     560     331     519     374
   16          521     577     373     526     558

TABLE e

                             Journey
 Pot            1       2       3       4       5

  1            629     592     423     642     620
  2            866     935     516     777     635

TABLE f

                       Cylinder
 Pot            3      10      16

  1            751    1006    1149
  2            982    1341    1406

TABLE g

                  Run 1              Run 2              Run 3              Run 4
                 Cylinder           Cylinder           Cylinder           Cylinder
 Journey      3    10    16      3    10    16      3    10    16      3    10    16

    1        99   117   188    117   146    76    125   200   190     70   100    67
    2       104   151   190    143   158   128     79    85   112     64   166   147
    3        69   117   128     76    93    97     39    72    63     51    49    85
    4       125   160   231    152   192   132     40    34    62     57   133   101
    5        49    69   258    104   136   130     55    61    50    115   108   120

TABLE h

                     Run 1                             Run 2
                    Journey                           Journey
 Pot        1     2     3     4     5         1     2     3     4     5

  1        203   237   148   258   201       154   131    95   167   155
  2        201   208   166   258   175       185   298   171   309   215

                     Run 3                             Run 4
                    Journey                           Journey
 Pot        1     2     3     4     5         1     2     3     4     5

  1        171   109    87    79    98       101   115    93   138   166
  2        344   167    87    57    68       136   262    92   153   177

TABLE i

                     Pot 1                   Pot 2
                    Cylinder                Cylinder
 Journey        3     10     16         3     10     16

    1          173    217    239       238    346    282
    2          133    226    233       257    334    344
    3          111    156    156       124    175    217
    4          160    226    256       214    293    270
    5          174    181    265       149    193    293

TABLE j

                     Pot 1                   Pot 2
                    Cylinder                Cylinder
 Run            3     10     16         3     10     16

  1            248    309    490       198    305    505
  2            184    289    229       408    436    334
  3            153    187    204       185    265    273
  4            166    221    226       191    335    294



Each table need not be completely analyzed, for there is a certain
amount of duplication. Thus Table a will yield

Among runs                      $\sum (\bar{X}_r - \bar{X})^2$
Among journeys                  $\sum (\bar{X}_j - \bar{X})^2$
Interaction (runs X journeys)   $\sum (\bar{X}_{rj} - \bar{X}_r - \bar{X}_j + \bar{X})^2$

and Table c will yield

Among runs                      $\sum (\bar{X}_r - \bar{X})^2$
Between pots                    $\sum (\bar{X}_p - \bar{X})^2$
Interaction (runs X pots)       $\sum (\bar{X}_{rp} - \bar{X}_r - \bar{X}_p + \bar{X})^2$

duplicating "among-runs"; similarly for other tables. The formal
four-way breakdown of the data is carried out in the table below.
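
The fifteen terms can also be obtained by machine directly from the
original seed counts, which makes the duplication among Tables a to j
explicit. The Python sketch below is an illustration under stated
assumptions, not the procedure used for the printed table: the names
`load`, `cell_ss` and `formal_breakdown` are hypothetical, each
interaction is obtained by subtracting all lower-order terms from the
sum of squares among the corresponding cell totals, and the results
should agree with the printed table to the two decimals shown there.

```python
from itertools import combinations
from math import prod

FACTORS = ("runs", "journeys", "pots", "cylinders")
LEVELS = (4, 5, 2, 3)

# One row per run and journey (run I journeys 1-5, then run II, and so on).
# Columns: pot 1 cylinders 3, 10, 16, then pot 2 cylinders 3, 10, 16.
SEED = [
    [47, 56, 100, 52, 61, 88], [55, 89, 93, 49, 62, 97],
    [35, 57, 56, 34, 60, 72], [78, 67, 113, 47, 93, 118],
    [33, 40, 128, 16, 29, 130],
    [52, 66, 36, 65, 80, 40], [21, 61, 49, 122, 97, 79],
    [31, 39, 25, 45, 54, 72], [43, 72, 52, 109, 120, 80],
    [37, 51, 67, 67, 85, 63],
    [50, 61, 60, 75, 139, 130], [33, 27, 49, 46, 58, 63],
    [24, 39, 24, 15, 33, 39], [18, 18, 43, 22, 16, 19],
    [28, 42, 28, 27, 19, 22],
    [24, 34, 43, 46, 66, 24], [24, 49, 42, 40, 117, 105],
    [21, 21, 51, 30, 28, 34], [21, 69, 48, 36, 64, 53],
    [76, 48, 42, 39, 60, 78],
]


def load(rows):
    """Index each count by (run, journey, pot, cylinder), all zero-based."""
    data = {}
    for i, row in enumerate(rows):
        run, journey = divmod(i, 5)
        for k, count in enumerate(row):
            pot, cylinder = divmod(k, 3)
            data[(run, journey, pot, cylinder)] = count
    return data


def cell_ss(data, axes):
    """Sum of squares among the totals of the cells picked out by `axes`."""
    totals = {}
    for key, count in data.items():
        cell = tuple(key[a] for a in axes)
        totals[cell] = totals.get(cell, 0) + count
    per_cell = len(data) // len(totals)
    correction = sum(data.values()) ** 2 / len(data)
    return sum(t * t for t in totals.values()) / per_cell - correction


def formal_breakdown(data):
    """Return {term: (sum of squares, degrees of freedom)} for all 15 terms."""
    n = len(FACTORS)
    subsets = [s for k in range(1, n + 1) for s in combinations(range(n), k)]
    cells = {s: cell_ss(data, s) for s in subsets}
    table = {}
    for s in subsets:
        # Remove every lower-order term contained in this set of factors.
        ss = sum((-1) ** (len(s) - len(t)) * cells[t]
                 for k in range(1, len(s) + 1) for t in combinations(s, k))
        df = prod(LEVELS[a] - 1 for a in s)
        table[" x ".join(FACTORS[a] for a in s)] = (ss, df)
    return table


table = formal_breakdown(load(SEED))
total_ss = sum(ss for ss, _ in table.values())
for name, (ss, df) in table.items():
    print(f"{name:40s} {ss:12.2f} {df:4d} {ss / df:10.2f}")
print(f"{'total':40s} {total_ss:12.2f} {sum(df for _, df in table.values()):4d}")
```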



 Term                                          Sum of      Degrees of     Mean
number   Source of variation                   squares      freedom      square

   1     Runs                                13,679.89         3        4,559.96
   2     Journeys                             9,684.00         4        2,421.00
   3     Cylinders                            9,132.87         2        4,566.44
   4     Pots                                 5,644.41         1        5,644.41
   5     Runs X journeys                     18,650.07        12        1,554.17
   6     Runs X cylinders                    11,532.73         6        1,922.12
   7     Runs X pots                          4,455.16         3        1,485.05
   8     Journeys X cylinders                 1,992.55         8          249.07
   9     Journeys X pots                      2,727.13         4          681.78
  10     Cylinders X pots                       146.46         2           73.23
  11     Runs X journeys X cylinders          9,104.18        24          379.34
  12     Runs X journeys X pots               6,855.46        12          571.29
  13     Journeys X cylinders X pots            917.12         8          114.64
  14     Runs X cylinders X pots              1,320.47         6          220.08
  15     Runs X journeys X cylinders X
           pots                               6,384.29        24          266.01

         Total                              102,226.79       119

Practical interest would be expected primarily to center on the
significance of the variability between pots, among runs, among journeys,
and among cylinders. With this in mind how can we best classify the
above fifteen terms? Note that an interaction term, such as, say term 9,
can be classified beneath either of two headings: variability among
journeys or between pots. "Between pots" has no interest, for
the pots were not used in any particular sequence. Hence terms such
as 9, involving pots and another factor, should be eventually listed under
the other factor. In a similar way, "between runs" means little,
for the runs are quite independent of each other. Finally, there is no
question as to how the main effects (terms 1 to 4) are to be classified as
only one factor is involved in each.

We obtain the following table:

                                                   DEGREES OF
SOURCE OF VARIATION                                 FREEDOM

Among runs
    Runs (term 1)                                       3

Between pots
    Pots (term 4)                                       1

Among journeys
    Journeys (term 2)                                   4
    Runs X journeys (term 5)                           12
    Journeys X pots (term 9)                            4
    Journeys X pots X runs (term 12)                   12

Among cylinders
    Cylinders (term 3)                                  2
    Runs X cylinders (term 6)                           6
    Cylinders X pots (term 10)                          2
    Runs X cylinders X pots (term 14)                   6

The following terms are not automatically placed by the criterion of
practical interest (run and pot variation of no interest):

                                                   DEGREES OF
                                                    FREEDOM

Term 7 (runs X pots)                                    3
Term 8 (journeys X cylinders)                           8
Term 11 (runs X journeys X cylinders)                  24
Term 13 (journeys X cylinders X pots)                   8
Term 15 (runs X journeys X cylinders X pots)           24

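Of these the first will be arbitrarily classified as a between-pot term.
The others are all journey X cylinder terms and constitute a residual
interaction, as will be seen from the following addition.

Term 8:   $(\bar{X}_{jc} - \bar{X}_j - \bar{X}_c + \bar{X})$

Term 11:  $(\bar{X}_{rjc} - \bar{X}_{rj} - \bar{X}_{rc} - \bar{X}_{jc} + \bar{X}_r + \bar{X}_j + \bar{X}_c - \bar{X})$

Term 13:  $(\bar{X}_{jcp} - \bar{X}_{jc} - \bar{X}_{jp} - \bar{X}_{cp} + \bar{X}_j + \bar{X}_c + \bar{X}_p - \bar{X})$

Term 15:  $(X_{jcpr} - \bar{X}_{rjc} - \bar{X}_{rjp} - \bar{X}_{rcp} - \bar{X}_{jcp} + \bar{X}_{rj} + \bar{X}_{rc} + \bar{X}_{rp} + \bar{X}_{jc} + \bar{X}_{jp} + \bar{X}_{cp} - \bar{X}_r - \bar{X}_j - \bar{X}_c - \bar{X}_p + \bar{X})$

All but four of the symbols cancel out, leaving

$X_{jcpr} - \bar{X}_{jpr} - \bar{X}_{cpr} + \bar{X}_{pr}$

which is the interaction term of $\bar{X}_{jpr}$ and $\bar{X}_{cpr}$,
i.e., the joint effect on seed of journeys and cylinders, the effect
differing from one pot-run to another. This is a secondary or residual
effect and is classified as such.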



Source of variation          Sum of       Degrees of      Mean
                             squares       freedom       square

Among journeys
  Overall journey            9,684.00          4        2,421.00
  Runs X journey            18,650.07         12        1,554.17
  By pot (9 + 12)            9,582.59         16          598.91

Among cylinders
  Overall cylinder           9,132.87          2        4,566.44
  Runs X cylinder           11,532.73          6        1,922.12
  By pot (10 + 14)           1,466.93          8          183.37

Between pots
  Pots                       5,644.41          1        5,644.41
  By runs                    4,455.16          3        1,485.05

Among runs                  13,679.89          3        4,559.96

Residual
  (8 + 11 + 13 + 15)        18,398.14         64          287.47

Total                      102,226.79        119
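
The regrouping is a matter of adding sums of squares and degrees of
freedom term by term, and can be checked as follows. The Python sketch
is an illustration only; the dictionary names and the `pool` helper are
hypothetical, and the figures are those printed in the formal four-way
table.

```python
# term number: (sum of squares, degrees of freedom), from the formal table
formal = {
    1: (13679.89, 3), 2: (9684.00, 4), 3: (9132.87, 2), 4: (5644.41, 1),
    5: (18650.07, 12), 6: (11532.73, 6), 7: (4455.16, 3), 8: (1992.55, 8),
    9: (2727.13, 4), 10: (146.46, 2), 11: (9104.18, 24), 12: (6855.46, 12),
    13: (917.12, 8), 14: (1320.47, 6), 15: (6384.29, 24),
}

# Which formal terms are compounded into each line of the final table.
groups = {
    "journeys: overall": [2],
    "journeys: runs x journey": [5],
    "journeys: by pot": [9, 12],
    "cylinders: overall": [3],
    "cylinders: runs x cylinder": [6],
    "cylinders: by pot": [10, 14],
    "pots": [4],
    "pots: by runs": [7],
    "runs": [1],
    "residual": [8, 11, 13, 15],
}


def pool(term_numbers):
    """Add sums of squares and degrees of freedom; return (ss, df, mean square)."""
    ss = sum(formal[t][0] for t in term_numbers)
    df = sum(formal[t][1] for t in term_numbers)
    return ss, df, ss / df


regrouped = {name: pool(numbers) for name, numbers in groups.items()}
for name, (ss, df, ms) in regrouped.items():
    print(f"{name:28s} {ss:11.2f} {df:4d} {ms:9.2f}")
```

Every formal term is used exactly once, so the regrouped lines still
add to the total of 102,226.79 with 119 degrees of freedom.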





The accompanying table is equivalent to the table set down by Tippett. 
Analysis of the mean squares may now be carried out, and the reader is 
referred to Tippett (42, a) for further discussion and final conclusions. 

2.19 The L tests. It has been noted that a test of the homogeneity
of variances ($L_1$) should precede the F test if the F test is to be
construed as a test of the homogeneity of means. The $L_1$ test was used
in the first chapter; we shall now illustrate this test in greater detail.
In addition a new preliminary test known as the $L_0$ test will be
illustrated. The three tests $L_0$, $L_1$, and F are most informative
when used together, as will be shown presently.

Rider (34, a) gives the following Western Electric Company data on 
the breaking strength in pounds tension of cement briquettes. 



                                 Batches
   1      2      3      4      5      6      7      8      9     10

  518    508    554    555    536    544    578    530    590    542
  560    574    598    567    492    502    532    564    554    556
  538    528    579    550    528    548    562    536    530    590
  510    534    538    535    572    562    524    540    572    546
  544    538    544    540    506    534    548    530    525    522

There are 50 observations, divided among 10 samples, each sample
having 5 observations. The notation will be k = 10, n = 5, and N = 50.
The questions which the $L_0$, $L_1$, and F tests can answer are:

(1) Could these 10 samples belong to normal populations having the
same mean and the same variance? ($L_0$ test.)

(2) Could these 10 samples belong to normal populations of the same
variance, no stipulations being made as to the mean? ($L_1$ test.)

(3) Could these 10 samples belong to normal populations whose means
are appreciably the same and whose variances are assumed the same?
(F test.)

The functions devised by Neyman and Pearson (30) are respectively:

[5]   $L_0 = \frac{(s_1^2 \, s_2^2 \cdots s_k^2)^{1/k}}{s_0^2}$

[6]   $L_1 = \frac{(s_1^2 \, s_2^2 \cdots s_k^2)^{1/k}}{s_a^2}$

where

$n s_i^2 = \sum (X - \bar{X}_i)^2$,   $N s_0^2 = \sum (X - \bar{X})^2$,   $s_a^2 = \frac{1}{k}(s_1^2 + s_2^2 + \cdots + s_k^2)$   and   $N = nk$

$s_i^2$ are the within-sample variances, $s_0^2$ is the variance based
on the deviation of all N observations about their mean $\bar{X}$, and
$s_a^2$ is the mean of the within-sample variances. $\bar{X}_i$ are the
sample means. For theoretical reasons, only the case of equal sample
(batch) sizes

$n_1 = n_2 = \cdots = n_k = n$

can be considered.




In both the $L_0$ and $L_1$ tests, if the hypothesis that the data do
belong to the specified normal population is true, the value of L tends
to unity, although the occurrence of unity will be a highly unlikely event
even if the hypothesis is true, for L is subject to sampling error. The
less the data support the hypothesis, the nearer for given n will the
value of the corresponding L come to zero.

The distributions of $L_0$ and $L_1$ have been approximated by Neyman
and Pearson (30) and tables have been prepared by Mahalanobis
(26) (Tables IX and X) and by Nayer (28).

In computing $L_0$ and $L_1$, we introduce the geometric mean $s_g^2$ of
the within-sample variances $s_i^2$.

$\log s_g^2 = \frac{1}{k} (\log s_1^2 + \log s_2^2 + \cdots + \log s_k^2)$

Then

$\log L_0 = \log s_g^2 - \log s_0^2$

$\log L_1 = \log s_g^2 - \log s_a^2$

             WITHIN-SAMPLE VARIANCE
SAMPLE              $s_i^2$

   1                324.80
   2                459.84
   3                509.44
   4                127.44
   5                754.56
   6                404.80
   7                384.96
   8                158.40
   9                607.36
  10                498.56

$s_0^2 = 528.96$

$s_a^2 = 423.02$

$s_g^2 = 374.75$

from which

$L_0 = 0.708$

$L_1 = 0.886$
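
The computation may be retraced from the briquette data themselves. The
Python sketch below is an illustration only; the names are hypothetical,
natural logarithms are used in place of common logarithms (the ratio is
unaffected), and the variances are taken with divisor n, which is the
convention the within-sample variances tabulated above follow.

```python
from math import exp, log

# Breaking strength by batch (one inner list per batch, 5 briquettes each).
batches = [
    [518, 560, 538, 510, 544], [508, 574, 528, 534, 538],
    [554, 598, 579, 538, 544], [555, 567, 550, 535, 540],
    [536, 492, 528, 572, 506], [544, 502, 548, 562, 534],
    [578, 532, 562, 524, 548], [530, 564, 536, 540, 530],
    [590, 554, 530, 572, 525], [542, 556, 590, 546, 522],
]


def variance(xs):
    """Mean squared deviation about the mean (divisor n, not n - 1)."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / len(xs)


within = [variance(batch) for batch in batches]               # s_i^2
s0_sq = variance([x for batch in batches for x in batch])     # all N about grand mean
sa_sq = sum(within) / len(within)                             # arithmetic mean of s_i^2
sg_sq = exp(sum(log(v) for v in within) / len(within))        # geometric mean of s_i^2

L0 = sg_sq / s0_sq
L1 = sg_sq / sa_sq

print([round(v, 2) for v in within])
print(f"s0^2 = {s0_sq:.2f}, sa^2 = {sa_sq:.2f}, sg^2 = {sg_sq:.2f}")
print(f"L0 = {L0:.3f}, L1 = {L1:.3f}")
```

Because a geometric mean never exceeds the arithmetic mean, and the
mean within-sample variance never exceeds the total variance, both
ratios necessarily lie between zero and unity.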






For n = 5 and k = 10, the 5 per cent level of $L_0$ (Table IX) is 0.4857.
We have $L_0$ = 0.708. The 10 samples are homogeneous with respect
to mean and variance.

As the $L_0$ test indicates that the samples came from the same normal
population (i.e., from normal populations of the same mean and the
same variance) the $L_1$ and F tests must both fail (i.e., show no
significance) for they test the nature of any non-homogeneity disclosed
by the $L_0$ test. From Table X, the 5 per cent level of $L_1$ is 0.6318,
indicating that the $L_1$ hypothesis is upheld; finally, the following
analysis of variance shows that the samples are homogeneous in their
means, i.e., the null F hypothesis is upheld.



Source of             Sum of        Degrees of     Mean square
variation             squares        freedom

Among batches         5,297.22           9            588.58
Within batches       21,150.80          40            528.77

Total                26,448.02          49






The following statistics have been computed from data and calcula- 
tions given by Budding and Baker (10). The original data deal with 
the breaking strain of glass tubing. Each sample contains eight obser- 
vations. 



Sample       Mean       Within-sample variance
                              $s_i^2$

   1         1010              4,025
   2         1100             38,013
   3         1020             38,488
   4         1100              6,700
   5         1070             34,300
   6         1180             15,475
   7         1030             12,375
   8         1180             43,100
   9         1040             14,775
  10         1200             11,238
  11          970             38,088
  12         1050             50,825
  13          840             58,675
  14          970             32,588
  15         1060             13,413
  16         1130              6,838



We find

$s_0^2 = \frac{4,364,878}{128} = 34,101$ (see following analysis of variance)

$s_a^2 = 26,182$

$\log s_g^2 = 4.2999$

from which

$L_0 = 0.58$

$L_1 = 0.76$

The 5 per cent levels are, approximately

$L_0 = 0.62$
$L_1 = 0.73$

The $L_0$ hypothesis is not supported, i.e., the samples differ
significantly in their means and/or variances. The result of the $L_1$
test indicates that the samples do not differ significantly in their
variances; hence they must differ in their means, and an analysis of
variance should support this expectation.



Source of variation          Sum of squares   Degrees of   Mean square
                                               freedom

Among samples                   1,013,550          15         67,570
Within samples (error)          3,351,328         112         29,923

Total                           4,364,878         127













From the above table

$F = 2.26$

which for 15 and 112 degrees of freedom is significant.
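
When only the sample means and within-sample variances are at hand, as
here, the three statistics can still be obtained. The Python sketch
below is an illustration under stated assumptions: the names are
hypothetical, the tabulated variances are taken to have divisor n = 8,
and the critical values are those quoted above rather than computed.

```python
from math import exp, log

n = 8  # observations per sample
means = [1010, 1100, 1020, 1100, 1070, 1180, 1030, 1180,
         1040, 1200, 970, 1050, 840, 970, 1060, 1130]
variances = [4025, 38013, 38488, 6700, 34300, 15475, 12375, 43100,
             14775, 11238, 38088, 50825, 58675, 32588, 13413, 6838]

k = len(means)
N = n * k
grand_mean = sum(means) / k        # valid because every sample has n items

# Rebuild the analysis-of-variance sums of squares from the summaries.
ss_among = n * sum((m - grand_mean) ** 2 for m in means)
ss_within = n * sum(variances)
ss_total = ss_among + ss_within

s0_sq = ss_total / N                                  # total variance
sa_sq = sum(variances) / k                            # arithmetic mean of s_i^2
sg_sq = exp(sum(log(v) for v in variances) / k)       # geometric mean of s_i^2

L0 = sg_sq / s0_sq
L1 = sg_sq / sa_sq
F = (ss_among / (k - 1)) / (ss_within / (N - k))

print(f"among = {ss_among:,.0f}, within = {ss_within:,.0f}, total = {ss_total:,.0f}")
print(f"L0 = {L0:.3f}, L1 = {L1:.3f}, F = {F:.2f}")
print("L0 below its 5 per cent level of about 0.62:", L0 < 0.62)
print("L1 above its 5 per cent level of about 0.73:", L1 > 0.73)
```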

A final example is based on data given by Campbell and Lovell (6)
on octane ratings of motor fuels. The fuels, which are of known
composition, are rated blind eight successive times in eight different
makes of car. The results are shown in the two following tables.

                    63.3 OCTANE NUMBER FUEL

                         Rating number
 Car      1      2      3      4      5      6      7      8

  A     63.8   65.7   65.7   65.7   63.8   67.6   67.6   65.8
  B     68.8   55.0   56.3   61.2   61.2   61.2   61.2   61.2
  C     60.0   60.0   60.0   63.8   56.3   67.6   67.6   72.8
  D     58.7   62.2   62.2   63.8   62.2   62.2   62.2   62.2
  E     60.0   63.8   63.8   63.8   60.0   60.0   63.8   63.8
  F     60.2   58.2   63.8   63.8   63.8   63.8   63.8   63.8
  G     63.8   62.5   64.8   64.8   65.8   65.8   63.8   65.8
  H     63.8   63.8   64.8   63.8   63.8   63.8   65.5   65.5

                    75.0 OCTANE NUMBER FUEL

                         Rating number
 Car      1      2      3      4      5      6      7      8

  A     75.0   75.0   73.0   73.0   77.0   76.0   76.0   76.0
  B     75.0   75.0   75.0   71.4   75.0   75.0   75.0   77.0
  C     75.0   75.0   75.0   75.0   75.0   75.0   74.3   75.0
  D     75.0   75.0   77.0   77.0   77.0   77.0   77.0   77.0
  E     75.0   75.0   75.0   73.3   75.0   77.0   77.0   74.0
  F     75.1   75.1   75.1   75.1   75.1   75.1   75.1   78.0
  G     73.0   75.0   75.0   75.0   75.0   75.0   75.0   75.0
  H     75.0   77.0   75.0   72.3   77.0   75.0   75.0   77.0

One of the authors' discoveries from these data is that at light knock
intensity (75.0 octane number) the variation in knock rating appears less
(the standard deviation is 1.2) than for the heavier knock intensity
(standard deviation is 3.0). To this they add, "No particular
significance is attached to the variations of the standard deviations
from car to car * * * because of the relatively small amount of data
available."

Let us examine this opinion. We find $L_1$ values of 0.48 and 0.65.
The 5 per cent level of $L_1$ for both fuels is about 0.71. Hence for both
grades of fuel the variation in s from car to car is statistically significant.

NOTES 

2.20 Cross-product terms in the analysis of variance. In setting the total 
sum of squares equal to the sum of sums of squares associated with specific 
factors, certain cross-product terms were assumed to be zero. It is easy to 
show that they are zero. Thus for the breakdown shown on page 70 

- xy = Ltd* - x) + (x H - Xi)_+ (x - i r oi 2 

- X)* + (,,, - Xtf + (X - X rl Y 

- X) + 2'(X l - X)(X - Xri) 




Omitting the common multiplier 2, the cross-product terms may be written 



The first term is an abbreviated statement of 



- X) + (X r ^ - X) 



The expression outside each bracket is constant for all terms within that 
bracket. The terms within each bracket represent the sum of the deviations of 
variates around their own mean; hence each bracket is zero. 

Similarly, each of the cross-product terms for the breakdown favored on 
page 73 is zero. We have 



- X )*= [(- X) + (Xi- X) + (X lc - Xi- X c + X) + (X-X lc )} 2 
(Xi- S) J + Z(2j- Xt- X c + AT+LCX-lic) 2 
i- X) + 2(X C - X) (X lc - I/- X,+ X) 

- X)(X- X lc ) + 2(Xi- X) (X lf - Xi- X C +X) 

- X)(X- X lc ) + 2Z(Xic- X t - X c + X) (X- X lc ) 



By the arguments used in the preceding example the third, fifth, and sixth 
cross-product terms are zero. Omitting the factor 2, the remaining cross- 
products may be written 

- x c ) - Z(, - l)(z z - x) 



As in the previous example the second and fourth terms are zero. The 
other term may be expanded with one term in parentheses outside the summa- 
tion. The term within the summation is the sum of the deviation of variates 
around their means and hence is zero. 

2.21 Distribution of F. The argument underlying the comparison of an
index of variation due to a suspected "cause" with a similar index associated
with unknown (= chance) causes is the following:

If an indefinitely large number of random samples are drawn from a known
population, a sample statistic (such as the mean or variance) will have a
continuous distribution curve which can often be exactly determined by
mathematical procedure rather than be approximated by any amount of
experimental sampling. For example, if an indefinitely large number of
random samples each of n independent observations are drawn from a normal
population of variance $\sigma^2$, and if for each sample the statistic

$$\chi^2 = \frac{\sum (x - \bar{x})^2}{\sigma^2}$$

is computed, the distribution of $\chi^2$, with $n - 1$ degrees of freedom, is

$$C\,(\chi^2)^{(n-3)/2}\, e^{-\chi^2/2}\, d\chi^2$$

where C is a constant.

Similarly, if we have two unbiased estimates $\sigma_x^2$ and $\sigma_y^2$ of the
variance $\sigma^2$ of a normal population, $f_x$ and $f_y$ being the number of
degrees of freedom on which each estimate is based, the ordinate of the
distribution of the ratio

$$F = \frac{\sigma_x^2}{\sigma_y^2}$$

is found to be a function $\psi(F)$, $\psi$ being known. This distribution is
found in the following way: The distribution of $\sigma_x^2$ (for a given value
of $\sigma_y^2$) is known. Similarly for $\sigma_y^2$. The distribution of their
ratio F is found by multiplying the distribution function of $\sigma_x^2$ (for a
given $\sigma_y^2$) by the distribution function of $\sigma_y^2$ and integrating
the product over all values of $\sigma_y$ (0 to $\infty$). Table VIII gives values
of F, for various $f_1$ and $f_2$, beyond which lies 5 per cent (and 1 per cent)
of the area under the curve of F, i.e., values of F satisfying the equation

$$\int_F^\infty \psi(t)\,dt = 0.05 \quad \text{or} \quad 0.01$$

To summarize: we compute the ratio F from the data. We then determine
the probability, in random sampling from a normal population, that the
computed ratio would be exceeded. If this probability is small (0.05 or 0.01)
we conclude that the mean square in the numerator of F is significantly
greater than the true estimate of $\sigma^2$ furnished by the denominator; the
mean squares could not have arisen from the same normal population, and the
variation associated with that cause is statistically significant.

2.22 Estimating $\sigma^2$. We now outline a proof that the mean squares given
in the last column of each analysis of variance table are unbiased estimates
of the population variance $\sigma^2$.

The method has been used by Irwin (22). Let N observations X be
divided among R rows and C columns, RC = N.

[Layout: the observations arranged in a table of R rows and C columns.]

Source of      Sum of squares                              Degrees of       Mean square
variation                                                  freedom

Rows           $\sum (\bar{X}_r - \bar{X})^2$                      R - 1            $\sum (\bar{X}_r - \bar{X})^2 / (R - 1)$
Columns        $\sum (\bar{X}_c - \bar{X})^2$                      C - 1            $\sum (\bar{X}_c - \bar{X})^2 / (C - 1)$
Interaction    $\sum (X - \bar{X}_r - \bar{X}_c + \bar{X})^2$      (R - 1)(C - 1)   $\sum (X - \bar{X}_r - \bar{X}_c + \bar{X})^2 / [(R - 1)(C - 1)]$

Total          $\sum (X - \bar{X})^2$                              RC - 1


It has already been shown that

$$\frac{\sum (X - \bar{X})^2}{RC - 1}$$

is an unbiased estimate of $\sigma^2$, based on $RC - 1$ degrees of freedom. Now for
the variance due to rows; write the expected value (mean value over all
samples) of $(\bar{X}_r - \bar{X})^2$ as follows, $X'$ being the population mean:

$$[7]\qquad E[(\bar{X}_r - \bar{X})^2] = E[\{(\bar{X}_r - X') - (\bar{X} - X')\}^2]$$

Expanding the right-hand side and writing down the expected values of the
resulting three functions, for example,

$$E[\sum (\bar{X} - X')^2] = \sigma^2$$

we find

$$E[\sum (\bar{X}_r - \bar{X})^2] = \sigma^2 (R - 1)$$

or

$$\frac{\sum (\bar{X}_r - \bar{X})^2}{R - 1}$$

is an unbiased estimate of $\sigma^2$ based on $R - 1$ degrees of freedom.

X is normally distributed, hence the mean $\bar{X}_r$ is also normally distributed;
the distribution of

$$\frac{\sum (\bar{X}_r - \bar{X})^2}{R - 1}$$

is essentially that of $s^2$.
Similarly,

$$\frac{\sum (\bar{X}_c - \bar{X})^2}{C - 1}$$

is an unbiased estimate of $\sigma^2$ based on $C - 1$ degrees of freedom.
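In the case of the sum of squares

$$\sum (X - \bar{X}_r - \bar{X}_c + \bar{X})^2$$

we have

$$E(X - \bar{X}_r - \bar{X}_c + \bar{X})^2 = E[(X - X') - (\bar{X}_r - X') - (\bar{X}_c - X') + (\bar{X} - X')]^2$$

After expansion and use of [7], we find

$$E[\sum (X - \bar{X}_r - \bar{X}_c + \bar{X})^2] = (R - 1)(C - 1)\sigma^2$$

or

$$\frac{\sum (X - \bar{X}_r - \bar{X}_c + \bar{X})^2}{(R - 1)(C - 1)}$$

is an unbiased estimate of $\sigma^2$ based on $(R - 1)(C - 1)$ degrees of freedom.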



CHAPTER III 
RELATIONSHIP AMONG VARIABLES 

3.1 Introduction. In several sciences, for example, physics, relationships
among variables are often stated in exact functional form. Thus
the relationship between time and distance for an object falling in a
vacuum is written simply as $s = \tfrac{1}{2}gt^2$, and it is implied that s is
exactly determinable from t.

Without questioning the validity of this practice in physics, it is
apparent that it is not valid in industrial research. Both the nature of
industrial experimenting and the impracticability of duplicating the
relatively controlled conditions of physics laboratories bring about this
result. The hardness and tensile strength of one aluminum casting
may be respectively X and Y, while for a second apparently identical
specimen we find X and 1.2Y. Specification of Y from X is subject to
error, i.e., from knowledge of X we can estimate only the average
(expected) value of Y, not the value which is actually observed.

3.2 Types of relationships. Such relationships among variables can
be classified. If there are two variables whose relationship is described
by a straight line, the term linear regression is used to describe the
relationship. If the relationship is parabolic, say $Y = aX + bX^2$ where
a and b are constants, the term curvilinear regression is used. For a
relationship among more than two variables, such as $Y = aU + bV$,
where a and b are constants, the term multiple regression is used. The
relationship among any k of n related variables, the remaining n - k
variables being, in the simplest case, held constant, is described as partial
correlation or partial regression. We shall presently illustrate simple
linear and simple multiple regression.

3.3 Uses of regression analysis. Regression analysis is useful
wherever hypotheses dealing with relationships are examined. To give
a few examples: in agriculture the relationship between crop size and
tree injury has been studied; in medicine, studies of the relationship
between vitamin potency and weight gain have successfully used
regression analysis; in economics, and other social sciences where perhaps
serious questions regarding the validity of the technique can be raised,
the use of regression analysis has been extensive and there seems to be
hardly any group of variables to which simple, multiple, and partial
correlation and regression analysis has not been applied. In industrial
research regression analysis can be used in the search for inexpensive
methods of testing as replacements for more expensive methods. We
shall give several examples of this usage.

3.4 General procedure. Let it be presumed that two variables are
linearly related. We construct the best fitting straight line.

$$Y_r = a + bX$$


The total variation of, say, the Y-value of an observation about its
mean $\bar{Y}$ can be represented by

$$(Y - \bar{Y})$$

where Y represents an individual observation. This may be divided
into two parts: first, a part explained by the relationship of Y to X, and
second, any remainder.

Consider the first part. If Y and X are unrelated, the expected or
best value of Y, given X, is $\bar{Y}$. If Y and X are related, the best
estimate of Y, given X, is $Y_r$, the ordinate of the regression line at X. The
greater the relationship between the two variables Y and X, the greater
the superiority of $Y_r$ over $\bar{Y}$ as an estimate of Y, for given X. Hence
the term

$$(Y_r - \bar{Y})$$

represents that part of the total variation allocable to regression.

The remainder consists of variation unallocable to regression, and
inasmuch as there are no other specific factors to which such variation
can be allocated, this term is of purely residual character. It consists
of the variation of observations about the regression line and is given by

$$(Y - Y_r)$$

In measuring these three classes of variation, it might be supposed
that we use

$$\sum (Y - \bar{Y}) = \sum (Y_r - \bar{Y}) + \sum (Y - Y_r)$$

the summation extending over all n values of Y or $Y_r$. This equation is
valid but not useful, for $\sum (Y - \bar{Y})$ is always zero regardless of the
amount of the variation of Y about $\bar{Y}$. If, however, variation is
measured by the squares of deviations we shall have

$$\sum (Y - \bar{Y})^2 = \sum (Y_r - \bar{Y})^2 + \sum (Y - Y_r)^2$$

as will presently be proved.

To obtain three estimates of the population variance from these sums
of squares, it is necessary to introduce, as before, the idea of degrees of
freedom. There are n values of Y in $\sum (Y - \bar{Y})^2$ less the grand mean
which is computed from Y, or n - 1 degrees of freedom. There are n
values of Y in $\sum (Y - Y_r)^2$ less the computed constants of $Y_r$ (a and
b), or n - 2 degrees of freedom. By subtraction there remains only
one degree of freedom for the linear regression estimate based on
$\sum (Y_r - \bar{Y})^2$, which is as it should be for the two constants of the
regression line (a and b) are restricted by $\bar{Y}$.



Source of variation     Sum of squares              Degrees       Mean square
                                                    of freedom

Linear regression       $\sum (Y_r - \bar{Y})^2$        1             $\sum (Y_r - \bar{Y})^2 / 1$
Residual                $\sum (Y - Y_r)^2$              n - 2         $\sum (Y - Y_r)^2 / (n - 2)$

Total                   $\sum (Y - \bar{Y})^2$          n - 1

The test of significance is again the F test. If there is no real linear
relationship between Y and X, the two mean squares should be the
same. If, however, the value of F in

$$F = \frac{\sum (Y_r - \bar{Y})^2 / 1}{\sum (Y - Y_r)^2 / (n - 2)}$$

is sufficiently large, the regression is real.

Thus if we make many drawings (of size n) from a bowl of chips,
each chip being marked with two numbers, one being X and one Y,
the distribution of X and Y being bivariate normal, with the overall
correlation between X and Y in the bowl being zero, and if from each
drawing we construct the regression and residual mean squares, the
distribution of their ratio F for all drawings of size n could be determined.
It is this distribution whose values are shown in Table VIII. For example,
for n = 21 only 5 per cent of such drawings will yield values of F
exceeding 4.38 (1 and 19 degrees of freedom).

The results obtained in an ordinary industrial experiment are judged
by the probabilities given in Table VIII. Thus if we find F = 10 for 1
and 19 degrees of freedom, we conclude that there is less than 1 chance
in 100 that the two mean squares are estimates of the variance of the
same homogeneous normal population. The hypothesis is rejected and we
shall conclude that the populations are not identical and that the
regression is real.
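
The bowl-of-chips experiment can be imitated on a computer. The sketch below is an illustration only, not the procedure used to construct Table VIII: it draws uncorrelated normal pairs, forms the ratio of the regression mean square to the residual mean square for each drawing, and counts how often the ratio exceeds 4.38. It takes 21 pairs per drawing so that the residual has n - 2 = 19 degrees of freedom; the function names, the number of drawings, and the random seed are arbitrary choices.

```python
import random


def regression_f_ratio(xs, ys):
    """Return the ratio of regression mean square to residual mean square."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_regression = sxy ** 2 / sxx        # 1 degree of freedom
    ss_residual = syy - ss_regression     # n - 2 degrees of freedom
    return ss_regression / (ss_residual / (n - 2))


def fraction_exceeding(critical_f, n, drawings, seed):
    """Fraction of drawings of n uncorrelated pairs whose F exceeds critical_f."""
    rng = random.Random(seed)
    exceed = 0
    for _ in range(drawings):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        ys = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if regression_f_ratio(xs, ys) > critical_f:
            exceed += 1
    return exceed / drawings


fraction = fraction_exceeding(critical_f=4.38, n=21, drawings=20000, seed=1)
print(f"fraction of drawings with F > 4.38: {fraction:.3f}")
```

With a large number of drawings the printed fraction should lie close to 0.05, subject to sampling fluctuation.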

3.5 Fitting the regression line. The criterion we shall use in fitting
the regression line $Y_r = a + bX$ is the following: if one of the two
variables X and Y (say Y) is subject to error while the other is not, the
sum of the squares of vertical distances from the Y's to the corresponding
$Y_r$, i.e., $\sum (Y - Y_r)^2$, is to be a minimum. This requirement yields two
equations which are solved for a and b. This and other properties of a
regression line fitted to observations by the method of least squares will
be developed later.

3.6 Examples of linear regression. Brenner (4) gives the following
data on the thickness in hundred-thousandths of an inch of non-magnetic
coatings of galvanized zinc on 11 pieces of iron and steel:

  THICKNESS AS MEASURED         THICKNESS AS MEASURED
  BY STANDARD DESTRUCTIVE       BY NON-DESTRUCTIVE
  STRIPPING METHOD              MAGNETIC METHOD
          Y                             X

         116                           105
         132                           120
         104                            85
         139                           121
         114                           115
         129                           127
         720                           630
         174                           155
         312                           250
         338                           310
         465                           443

Measurement of thickness by stripping is accurate but the tests are
destructive and costly. The magnetic method is less costly. Do the
data support the belief that we may measure X and use $Y_r$ as an
estimate of Y where

$$Y_r = a + bX$$

a and b being constants?

The following is a résumé of the procedure described in 3.4. The
significance of a straight line

$$Y_r = a + bX$$

fitted to data by the method of least squares, may be tested by
determining the probability, in random sampling from a normal population,
that the computed value of

$$F = \frac{\sum (Y_r - \bar{y})^2 / 1}{\sum (Y - Y_r)^2 / (n - 2)}$$

would be exceeded, where

$$\frac{\sum (Y_r - \bar{y})^2}{1}$$

is the mean square attributable to regression, with 1 degree of freedom,
and

$$\frac{\sum (Y - Y_r)^2}{n - 2}$$

the residual or chance mean square, not accounted for by regression,
with n - 2 degrees of freedom.

$\bar{y}$ is the mean of Y and n is the number of pairs of observations.

The Y values appear not to have come from a normal population but
experimental evidence on this point indicates that the F test can
probably be used. The points are plotted in the following diagram.



[Diagram: thickness by the stripping method (Y) plotted against thickness
by the magnetic method (X) for the 11 pieces.]

Using the method of least squares to determine a and b,

$$a = \frac{\sum X^2 \sum Y - \sum X \sum XY}{n\sum X^2 - (\sum X)^2}$$

$$b = \frac{n\sum XY - \sum X \sum Y}{n\sum X^2 - (\sum X)^2}$$

From the data,

$\sum Y = 2,743$
$\sum X = 2,461$
$\sum XY = 952,517$
$\sum Y^2 = 1,067,143$
$\sum X^2 = 852,419$
$n = 11$

from which

a = -1.7948

b = 1.1226

Using these values of a and b, the predicted values of thickness
calculated from a knowledge of the magnetic readings X are shown in the
following table.

  PREDICTED VALUES OF       TRUE VALUES OF
  THICKNESS                 THICKNESS
  $Y_r = a + bX$                  Y

        116.08                   116
        132.92                   132
         93.64                   104
        134.04                   139
        127.31                   114
        140.78                   129
        705.43                   720
        172.21                   174
        278.86                   312
        346.21                   338
        495.52                   465

Is the discrepancy between these pairs of values small, from a
statistical point of view? The answer is found, i.e., the adequacy of a linear
equation is determined, by comparing that part of the total variability
for which the line can account with remaining unaccountable (chance)
variability.

Source of variation     Sum of squares              Degrees       Mean square
                                                    of freedom

Regression              $\sum (Y_r - \bar{Y})^2$        1             $\sum (Y_r - \bar{Y})^2 / 1$
Residual                $\sum (Y - Y_r)^2$              n - 2         $\sum (Y - Y_r)^2 / (n - 2)$

Total                   $\sum (Y - \bar{Y})^2$          n - 1





We have 



= 1,067,143 - 



r - P) 2 = 



n 11 

= 383,138.54 

= 1,064,356.41 

= 380,377.41 
from which, by subtraction,

$\sum (Y - Y_r)^2 = 2,761.13$



Source of variation     Sum of squares     Degrees of     Mean square
                                           freedom

Regression                380,377.41           1           380,377.41
Residual                    2,761.13           9               306.79

Total                     383,138.54          10





If the regression line is inadequate, the mean square due to regression
will not be significantly larger than the residual or chance mean square.
In our case, the ratio is

$$F = \frac{380,377.41}{306.79} = 1240$$



Table VIII gives the values of F which, with one and nine degrees of
freedom, are necessary in order to establish regression as (1) significant
and (2) highly significant. F need be only 10.56 for the regression to be
highly significant. Since we have F = 1240, our estimates of thickness
from magnetic measurements by use of the equation

$$Y_r = -1.7948 + 1.1226X$$

are statistically sound. The practical man must now decide if the
discrepancies between Y and $Y_r$ are sufficiently small, from the point of
view of the use to which the product is put.
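
The arithmetic of this example can be reproduced in a few lines. The sketch below is an illustration, not the author's worksheet: it applies the least-squares formulas for a and b to the eleven pairs tabulated above and forms the analysis of variance from the sums and sums of squares. The function name and the dictionary keys are hypothetical.

```python
def linear_regression_anova(xs, ys):
    """Least-squares line Y_r = a + b*X and its analysis of variance."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)

    denom = n * sum_x2 - sum_x ** 2
    b = (n * sum_xy - sum_x * sum_y) / denom
    a = (sum_x2 * sum_y - sum_x * sum_xy) / denom

    ss_total = sum_y2 - sum_y ** 2 / n                 # n - 1 degrees of freedom
    ss_regression = b * (sum_xy - sum_x * sum_y / n)   # 1 degree of freedom
    ss_residual = ss_total - ss_regression             # n - 2 degrees of freedom
    f_ratio = ss_regression / (ss_residual / (n - 2))
    return {"a": a, "b": b, "ss_total": ss_total,
            "ss_regression": ss_regression, "ss_residual": ss_residual,
            "F": f_ratio}


# Coating thickness: stripping method (Y) and magnetic method (X).
magnetic = [105, 120, 85, 121, 115, 127, 630, 155, 250, 310, 443]
stripping = [116, 132, 104, 139, 114, 129, 720, 174, 312, 338, 465]

result = linear_regression_anova(magnetic, stripping)
for name, value in result.items():
    print(f"{name:14s} {value:14.4f}")
```

The printed constants and sums of squares should agree with the values in the text to within rounding; the same function can be applied to the other ungrouped examples of this section.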

Jennett and Budding (24) give the following 11 observations on life 
tests of electric light bulbs and tests on filament wire. 

  LIFE OF BULB        QUALITY TEST
  IN HOURS            ON FILAMENT
       Y                    X

     1,605                 276
     1,120                 293
     1,320                 288
     1,225                 315
     1,055                 305
     1,390                 315
     1,385                 306
     1,700                 286
     2,070                 289
     1,395                 296
     1,105                 335

A life test required about 1000 hours and cost about $5 per bulb.
The wire test is quickly performed and is lower in cost. If only wire
tests are made, can life be estimated from

$$Y_r = a + bX\,?$$
As before, we have

$\sum Y = 15,370$
$\sum X = 3,304$
$\sum XY = 4,588,135$
$\sum X^2 = 995,238$
$n = 11$

from which a = 4,410.296 and b = -10.031.

Source of variation     Sum of squares     Degrees of     Mean square
                                           freedom

Regression                285,429.57           1           285,429.57
Residual                  617,238.62           9            68,582.07

Total                     902,668.19          10





Finally 



F = 4.1 



From Table VIII, with one and nine degrees of freedom F need be as
high as 10.56 in order for the linear regression to be highly significant
(1 per cent level) or 5.12 for significance at the customary level (5 per
cent). F = 4.1 does not meet these requirements. Linear regression
does not account for a sufficient part of the total variability; it is not
adequate for the purpose of prediction. The residual variability about
the regression line

$$\frac{\sum (Y - Y_r)^2}{n - 2}$$

is too large. Values of $Y_r$ are calculated from

$$Y_r = 4,410.296 - 10.031X$$

and the following table shows $Y_r$ and Y:



  PREDICTED VALUES       ACTUAL LIFE
  OF LIFE                TESTS
       $Y_r$                   Y

     1,641.7               1,605
     1,471.2               1,120
     1,521.4               1,320
     1,250.5               1,225
     1,350.8               1,055
     1,250.5               1,390
     1,340.8               1,385
     1,541.4               1,700
     1,511.3               2,070
     1,441.1               1,395
     1,049.9               1,105

The correspondence of $Y_r$ and Y is not sufficiently high.

It is impossible to state flatly that a sample of 11 is too small, but
samples of perhaps 40 or more observations are desirable if tentative
conclusions of industrial importance are to be drawn from the results.

As an example from chemical research work, consider the following
data given by Thomsen (41); we wish to predict titer values from the
iodine values of fatty acids.

       Y              X              Y              X
  Titer value    Iodine value   Titer value    Iodine value
  (minus 40)     (minus 40)     (minus 40)     (minus 40)

      2.5            7.2            2.0           14.2
      2.0           10.3            1.8           10.3
      3.0            7.9            5.4            0.5
      3.2            8.2            3.1            9.4
      2.1           13.7            4.8            1.4
      4.8            1.9            3.7            5.9
      2.1           13.0            0.1           16.4
      1.5           17.5            1.3           17.3
      4.8            0.3            4.8            0.8
      3.8            7.3            1.3           16.4
      2.2           12.1            0.0           12.2
      0.4           18.5            5.0            2.5
      6.6           -1.3            2.4           13.4
      4.3            1.9            5.7            0.3
      2.0           13.8            3.9            4.1
      1.2           15.1            2.1           11.4
      2.3            8.8            0.3           21.7
      3.2            6.3            3.5            5.7
      1.2           15.4            4.3            3.5
      5.3           -0.2            4.5            3.8
      4.0            0.7            2.2           11.1
      2.9            9.1            3.4            7.6
      3.0            8.8            2.3            2.8
      2.9            9.1            4.9            1.9
      3.1            8.8            1.7           17.8



We have

$\sum Y = 148.9$
$\sum X = 426.6$
$\sum XY = 853.37$
$\sum Y^2 = 561.77$
$\sum X^2 = 5,387.08$
$n = 50$

The accompanying table shows the analysis of variance.

Source of variation     Sum of squares     Degrees of     Mean square
                                           freedom

Regression                  99.5407            1             99.5407
Residual                    18.8051           48              0.3918

Total                      118.3458           49





Forty has been subtracted from the observed values of both X and Y; 
this facilitates computation, for the smaller numbers are easier to handle 
and it does not affect the numerical analysis. 

We have F = 254.08. The 1 per cent value of F is 7.19; the regres- 
sion (which is negative, i.e., high values of one variable are generally 
associated with low values of the other variable and vice versa) is 
highly significant. 
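
That subtracting a constant leaves the analysis unchanged can be seen numerically. The sketch below is illustrative only: it uses the first six pairs of the table above (a subset chosen for brevity) and compares the slope and the regression and residual sums of squares computed from the coded values with those computed after 40 is added back. Only the intercept changes. The function name is hypothetical.

```python
import math


def slope_and_sums_of_squares(xs, ys):
    """Return (slope b, regression sum of squares, residual sum of squares)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_regression = sxy ** 2 / sxx
    return sxy / sxx, ss_regression, syy - ss_regression


# First six coded pairs from the table (iodine value X, titer value Y, both minus 40).
coded_x = [7.2, 10.3, 7.9, 8.2, 13.7, 1.9]
coded_y = [2.5, 2.0, 3.0, 3.2, 2.1, 4.8]

# The same observations with the 40 restored.
raw_x = [x + 40 for x in coded_x]
raw_y = [y + 40 for y in coded_y]

coded = slope_and_sums_of_squares(coded_x, coded_y)
raw = slope_and_sums_of_squares(raw_x, raw_y)

# Deviations from the mean are unaffected by a shift of origin.
unchanged = all(math.isclose(c, r, rel_tol=1e-9, abs_tol=1e-9)
                for c, r in zip(coded, raw))
print("coded:", coded)
print("raw:  ", raw)
print("slope and sums of squares unchanged:", unchanged)
```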

3.7 Linear regression in grouped data. In the previous examples we 
had at most 50 observations, so there was no reason to group the obser- 
vations. In the present example there were originally 440 observations; 
they have been grouped into classes in the table below. In such a 
case the analysis is slightly more complex. 

It is here assumed that the variances of all eight columns are
equivalent, within the limits of chance variation. This assumption,
which may be checked by the $L_1$ test, must be met for the F test to
be valid.

The British Cotton Industry Research Association (5) gives the 
following data on the frequency of warp breaks in weaving, classi- 
fied according to values of an important influence, namely, relative 
humidity. 

                          RELATIVE HUMIDITY (X)

 WARP
 BREAKS     68-   70-   72-   74-   76-   78-   80-   82-    Total
 (Y)

  0.0-                         2     5     5     1             13
  0.8-             1     7    13    28    28    11     4       92
  1.6-       2     6    16    27    44    35     9     1      140
  2.4-       1     5    24    24    27    17     3     2      103
  3.2-             2    16     6    15     9     2     1       51
  4.0-       2     1     4     7     7     5     1             27
  4.8-             1     2     2     3     2                   10
  5.6-                   1                 2                    3
  6.4-                                     1                    1

 Total       5    16    70    81   129   104    27     8      440

Would the regression line

$$Y_r = a + bX$$

enable us effectively to predict warp breakage, or is the residual
variability too great?
We shall first proceed as before.

$\sum Y = 1,063.2$
$\sum X = 33,446$
$\sum XY = 80,524.4$
$\sum Y^2 = 3,110.4$
$\sum X^2 = 2,545,724$
$n = 440$

from which

a = 9.028
b = -0.087

The analysis of variance is given in the following table.



Source of variation     Sum of squares     Degrees of     Mean square
                                           freedom

Regression                   25.51             1              25.51
Residual                    515.81           438               1.18

Total                       541.32           439





An F test indicates significant regression. But the sum of squares
not due to regression, $\sum (Y - Y_r)^2$, consists of two parts: the sum
of squares of the deviations of the column means $\bar{Y}_c$ about the
corresponding $Y_r$, i.e., $\sum (\bar{Y}_c - Y_r)^2$, plus the variability within
columns $\sum (Y - \bar{Y}_c)^2$, which is more truly the chance or unallocable
variability.

The total residual degrees of freedom were n - 2. The unallocable
part $\sum (Y - \bar{Y}_c)^2$ uses k means computed from the data: hence the
unallocable mean square is $\sum (Y - \bar{Y}_c)^2 / (n - k)$. By subtraction, the
deviation-from-regression mean square has k - 2 degrees of freedom.

Thus for linear regressions calculated from grouped data, the following
subsidiary variance analysis may be useful.

Source of variation       Sum of squares                Degrees of     Mean square
                                                        freedom

Deviation of means        $\sum (\bar{Y}_c - Y_r)^2$        k - 2          $\sum (\bar{Y}_c - Y_r)^2 / (k - 2)$
from regression
Unallocable part of       $\sum (Y - \bar{Y}_c)^2$          n - k          $\sum (Y - \bar{Y}_c)^2 / (n - k)$
residual (chance)

Residual                  $\sum (Y - Y_r)^2$                n - 2





We may compute the "chance" sum of squares and then determine
the deviation-from-regression sum of squares by subtraction.

It should be observed that all summations extend over the entire data.
Hence, each mean must be counted as many times as there are
observations from which that mean was computed. This was discussed in
Section 2.10.



Source of variation       Sum of squares     Degrees of     Mean square
                                             freedom

Deviation of means              3.37              6              0.57
from regression
Unallocable part of           512.44            432              1.19
residual (chance)

Residual                      515.81            438
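
The grouped-data analysis can be retraced from the frequency table. The sketch below is an illustration under stated assumptions: each observation is given the midpoint of its class (68.5, 70.5, ... for relative humidity and 0.4, 1.2, ... for warp breaks, values chosen because they reproduce the printed sums), and the cell frequencies are those of the table as read here. Variable names are hypothetical.

```python
# Class midpoints (an assumption that reproduces the printed sums).
x_mid = [68.5 + 2 * j for j in range(8)]      # relative humidity classes
y_mid = [0.4 + 0.8 * i for i in range(9)]     # warp-break classes

# freq[i][j] = number of observations in Y class i and X class j.
freq = [
    [0, 0, 0, 2, 5, 5, 1, 0],
    [0, 1, 7, 13, 28, 28, 11, 4],
    [2, 6, 16, 27, 44, 35, 9, 1],
    [1, 5, 24, 24, 27, 17, 3, 2],
    [0, 2, 16, 6, 15, 9, 2, 1],
    [2, 1, 4, 7, 7, 5, 1, 0],
    [0, 1, 2, 2, 3, 2, 0, 0],
    [0, 0, 1, 0, 0, 2, 0, 0],
    [0, 0, 0, 0, 0, 1, 0, 0],
]
cells = [(freq[i][j], x_mid[j], y_mid[i])
         for i in range(9) for j in range(8)]

n = sum(f for f, _, _ in cells)
sum_x = sum(f * x for f, x, _ in cells)
sum_y = sum(f * y for f, _, y in cells)
sxx = sum(f * x * x for f, x, _ in cells) - sum_x ** 2 / n
syy = sum(f * y * y for f, _, y in cells) - sum_y ** 2 / n
sxy = sum(f * x * y for f, x, y in cells) - sum_x * sum_y / n

b = sxy / sxx
a = (sum_y - b * sum_x) / n
ss_regression = sxy ** 2 / sxx
ss_residual = syy - ss_regression

# Within-column ("chance") sum of squares: each column about its own mean.
k = len(x_mid)
ss_within = 0.0
for j in range(k):
    n_c = sum(freq[i][j] for i in range(9))
    t_c = sum(freq[i][j] * y_mid[i] for i in range(9))
    q_c = sum(freq[i][j] * y_mid[i] ** 2 for i in range(9))
    ss_within += q_c - t_c ** 2 / n_c

# Deviation of column means from the regression line, by subtraction.
ss_deviation = ss_residual - ss_within
ms_deviation = ss_deviation / (k - 2)
ms_within = ss_within / (n - k)

print(f"n = {n}, a = {a:.3f}, b = {b:.3f}")
print(f"regression {ss_regression:.2f}, residual {ss_residual:.2f}")
print(f"deviation of means {ss_deviation:.2f}, within columns {ss_within:.2f}")
print(f"mean squares: {ms_deviation:.2f} and {ms_within:.2f}")
```

The results agree with the tabulated figures to within a unit or two in the last printed place, which is consistent with rounding in the original hand computation.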





The earlier F test

$$F = \frac{25.51}{1.18}$$

for 1 and 438 degrees of freedom showed the regression to be highly
significant. The present value of F is the ratio of 25.51 to 1.19, for 1
and 432 degrees of freedom, and the conclusion originally reached is
shown to be valid. In the present example the subsidiary analysis of
variance was uninformative. If, however, the original test showed the
regression not to be significant, any reduction of the residual mean square
by elimination of the (possibly) allocable element (deviation from
regression) might show the regression to be significant, and the latter
inference would be the proper one.

NOTES

3.8 Breakdown of $\sum (Y - \bar{Y})^2$. The accompanying graph may be helpful.
It is apparent from the graph that an observation Y may always be written

$$Y = \bar{Y} + (Y_r - \bar{Y}) + (Y - Y_r)$$

the first term to the right of the equality sign ($\bar{Y}$) being common to all
observations, the second term $(Y_r - \bar{Y})$ representing that part of the value
of the observation Y attributable to regression, and the last term $(Y - Y_r)$
representing the unallocable (and therefore dealt with as chance) variability
about the line of regression. The total deviation $(Y - \bar{Y})$ thus consists of
two parts, regression $(Y_r - \bar{Y})$ and residual $(Y - Y_r)$.

$$(Y - \bar{Y}) = (Y_r - \bar{Y}) + (Y - Y_r)$$

Summing this linear expression merely leads to zero = zero. We must show that

$$[1]\qquad \sum (Y - \bar{Y})^2 = \sum (Y_r - \bar{Y})^2 + \sum (Y - Y_r)^2$$

We have

$$[2]\qquad \sum (Y - \bar{Y})^2 = \sum [(Y - Y_r) + (Y_r - \bar{Y})]^2$$

$$= \sum (Y - Y_r)^2 + \sum (Y_r - \bar{Y})^2 + 2\sum (Y - Y_r)(Y_r - \bar{Y})$$

We now show that if the regression line

$$Y_r = a + bX$$

is fitted to the data by the method of least squares, the cross-product term of
[2] is zero and [1] is valid.

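3.9 The method of least squares. In the method of least squares, the
function $Y_r = a + bX$ is fitted to the data so that

$$\sum (Y - Y_r)^2 = \sum (Y - a - bX)^2$$

which will be designated by $\varphi$, is minimum. The necessary conditions are

$$\frac{\partial}{\partial a} \sum (Y - a - bX)^2 = 0, \qquad \frac{\partial}{\partial b} \sum (Y - a - bX)^2 = 0$$

Differentiation yields

$$[3]\qquad \sum (Y - a - bX) = 0, \qquad \sum (Y - a - bX)X = 0$$

It is apparent that $\partial^2\varphi/\partial a^2$ and $\partial^2\varphi/\partial b^2$ are always positive,
i.e., [3] are conditions for minimum $\varphi$. [3] may be written

$$\sum Y = na + b\sum X$$

$$\sum XY = a\sum X + b\sum X^2$$

These "normal" equations may be solved simultaneously for a and b, the
Y-axis intercept and the slope of the regression line.
We return to the cross-product term of [2], which may be written

$$\sum (Y - Y_r)(Y_r - \bar{Y}) = \sum (Y - Y_r)Y_r - \bar{Y}\sum (Y - Y_r)$$

$$= \sum (Y - Y_r)(a + bX) - \bar{Y}\sum (Y - Y_r)$$

$$[4]\qquad = a\sum (Y - Y_r) + b\sum (Y - Y_r)X - \bar{Y}\sum (Y - Y_r)$$

From [3] it is clear that [4] is 0. Hence [1] is valid.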
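
The vanishing of the cross-product term can be confirmed numerically on any set of data. The sketch below is an illustration only: it fits the line to the lamp-life observations of Section 3.6 by the least-squares formulas, then evaluates the two conditions [3], the cross-product term of [2], and the two sides of [1]. Variable names are hypothetical.

```python
# Lamp-life data of Section 3.6: filament test (X) and life in hours (Y).
xs = [276, 293, 288, 315, 305, 315, 306, 286, 289, 296, 335]
ys = [1605, 1120, 1320, 1225, 1055, 1390, 1385, 1700, 2070, 1395, 1105]

n = len(xs)
sum_x, sum_y = sum(xs), sum(ys)
sum_xy = sum(x * y for x, y in zip(xs, ys))
sum_x2 = sum(x * x for x in xs)

# Solution of the normal equations.
denom = n * sum_x2 - sum_x ** 2
b = (n * sum_xy - sum_x * sum_y) / denom
a = (sum_x2 * sum_y - sum_x * sum_xy) / denom

mean_y = sum_y / n
fitted = [a + b * x for x in xs]
residuals = [y - yr for y, yr in zip(ys, fitted)]

# Conditions [3]: both sums should vanish apart from rounding error.
condition_a = sum(residuals)
condition_b = sum(r * x for r, x in zip(residuals, xs))

# Cross-product term of [2].
cross_product = sum(r * (yr - mean_y) for r, yr in zip(residuals, fitted))

# The two sides of identity [1].
ss_total = sum((y - mean_y) ** 2 for y in ys)
ss_regression = sum((yr - mean_y) ** 2 for yr in fitted)
ss_residual = sum(r ** 2 for r in residuals)

print(f"a = {a:.3f}, b = {b:.3f}")
print(f"conditions [3]: {condition_a:.2e}, {condition_b:.2e}")
print(f"cross-product term: {cross_product:.2e}")
print(f"total {ss_total:.2f} = {ss_regression:.2f} + {ss_residual:.2f}")
```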

3.10 Curvilinear regression. If a curvilinear regression function, say
the parabola

$$Y_r = a + bX + cX^2$$

(with two degrees of freedom) or an m-degree function (with m degrees of
freedom) is fitted to the data by the method of least squares, the cross-product
term is easily shown to be 0. The normal equations for determining a, b, and
c would be

$$\sum Y = na + b\sum X + c\sum X^2$$

$$\sum XY = a\sum X + b\sum X^2 + c\sum X^3$$

$$\sum X^2 Y = a\sum X^2 + b\sum X^3 + c\sum X^4$$

The extension to the general case of an mth degree polynomial is obvious.

A high degree regression function may fit a set of observations "better"
than a low degree function, but the error term may be increased by loss of
degrees of freedom. Lower degree functions, such as the straight line or
the parabola, are to be preferred for it is best to formalize observed
relationships as simply as possible. For it is often impossible, particularly
in industrial research, to rationalize (i.e., to explain by recourse to available
theory) any but these simpler relationships. Higher-degree regression
functions may lead to new theory, but the use of simpler relationships is in
keeping with the conservative methodology of experimental science.
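
The three normal equations for the parabola can be solved by ordinary elimination. The sketch below is a minimal illustration rather than a recommended computing routine: it builds the equations from the power sums of X and solves them by Gauss-Jordan elimination. The test points are invented for the purpose and lie exactly on $Y = 1 + 2X + 3X^2$, so the fitted constants should reproduce 1, 2, and 3.

```python
def fit_parabola(xs, ys):
    """Solve the normal equations for Y_r = a + b*X + c*X**2."""
    # power[p] = sum of X**p for p = 0..4 (power[0] equals n).
    power = [sum(x ** p for x in xs) for p in range(5)]
    # moment[p] = sum of X**p * Y for p = 0..2.
    moment = [sum((x ** p) * y for x, y in zip(xs, ys)) for p in range(3)]

    # Augmented matrix of the three normal equations.
    m = [[power[0], power[1], power[2], moment[0]],
         [power[1], power[2], power[3], moment[1]],
         [power[2], power[3], power[4], moment[2]]]

    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [mr - factor * mc for mr, mc in zip(m[r], m[col])]
    return tuple(m[i][3] / m[i][i] for i in range(3))


# Invented points lying exactly on Y = 1 + 2X + 3X^2.
xs = [0, 1, 2, 3, 4]
ys = [1 + 2 * x + 3 * x * x for x in xs]

a, b, c = fit_parabola(xs, ys)
print(f"a = {a:.6f}, b = {b:.6f}, c = {c:.6f}")
```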

3.11 Degrees of freedom. A basic explanation of the allocation of the
total number of degrees of freedom is beyond the possibilities of this book,
but the following may be helpful. The quantity $\sum (Y - \bar{Y})^2$ summed over
n observations has n - 1 degrees of freedom, for the mean $\bar{Y}$ has been
calculated from the n observations. The regression sum of squares
$\sum (Y_r - \bar{Y})^2$ may be written

$$\sum (Y_r - \bar{Y})^2 = \sum (\bar{Y} + b(X - \bar{X}) - \bar{Y})^2$$

$$= b^2 \sum (X - \bar{X})^2$$

$X - \bar{X}$ is independent of any correlation between X and Y. Hence variability
in $\sum (Y_r - \bar{Y})^2$ depends on b; accordingly only one degree of freedom is
allocated to regression. The residual sum of squares $\sum (Y - Y_r)^2$ absorbs the
remaining n - 2 degrees of freedom.

3.12 Linear regression in grouped data. The accompanying diagram
refers to the problem of linear regression in grouped data.

$$Y = \bar{Y} + (Y_r - \bar{Y}) + (Y - Y_r)$$

$$= \bar{Y} + (Y_r - \bar{Y}) + (\bar{Y}_c - Y_r) + (Y - \bar{Y}_c)$$

To show that

$$[5]\qquad \sum (Y - Y_r)^2 = \sum (\bar{Y}_c - Y_r)^2 + \sum (Y - \bar{Y}_c)^2$$

write

$$\sum (Y - Y_r)^2 = \sum [(\bar{Y}_c - Y_r) + (Y - \bar{Y}_c)]^2$$

$$= \sum (\bar{Y}_c - Y_r)^2 + \sum (Y - \bar{Y}_c)^2 + 2\sum (\bar{Y}_c - Y_r)(Y - \bar{Y}_c)$$

But

$$\sum (\bar{Y}_c - Y_r)(Y - \bar{Y}_c) = \sum \bar{Y}_c (Y - \bar{Y}_c) - \sum Y_r (Y - \bar{Y}_c)$$

$$= 0 - 0$$

or [5] is valid.
3.13 Computation of sums of squares for linear regression analysis. To
compute $\sum (Y_r - \bar{Y})^2$ we may use the expansion



for, from the normal equations 



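It is, however, not necessary to calculate $Y_r$ to find $\sum (Y_r - \bar{Y})^2$, for

$$\sum (Y_r - \bar{Y})^2 = \sum (a + bX - \bar{Y})^2 = b^2 \sum (X - \bar{X})^2 = b^2 \left[ \sum X^2 - \frac{(\sum X)^2}{n} \right]$$

Convenient expansions for the other sums of squares are

$$\sum (Y - \bar{Y})^2 = \sum Y^2 - n\bar{Y}^2 = \sum Y^2 - \frac{(\sum Y)^2}{n}$$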
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and 

- r r )(r - Y T ) 



3.14 Regression and prediction (see Eisenhart, 12). The regression line

$$Y_r = a + bX$$

obtained by minimizing

$$\sum (Y - Y_r)^2$$

X being the independent variable, differs from the regression line
$X_r = c + dY$ obtained if

$$\sum (X - X_r)^2$$

is minimized. The decision as to which line (or curve) is appropriate (i.e.,
which of the two variables, X and Y, is to be considered independent) depends
not on what we would like to predict, but on which of the two variables,
X and Y, is free from error. If we are studying the relationship between
quality of output (Y) and time (X), the latter will generally be represented by
values (selected in advance of measurement of Y), say, at daily or weekly
intervals; such selected values are free from error. Measurements of Y are,
however, subject to error and the appropriate regression line would be that
which minimized $\sum (Y - Y_r)^2$, that is

$$Y_r = a + bX$$

This is the linear regression of Y on X. It is important to note that in the
theory of regression, only the dependent variable (Y in the above example)
is required to be normally distributed. The correctness of regression analysis
is unimpaired by the fact that the X values are arbitrarily selected, for
example, uniformly spaced.

Many problems in industrial research do not lend themselves to a clear-cut 
decision as to which variable is free from error. In the example on titer and 
iodine, neither variable appears to have been selected, and both X and F vary 
normally and one regression line is apparently as good as the other. A better 
solution to this problem has been suggested by Wald (44). 

Occasionally the variable to be estimated, say F may not be subject to error 
whereas X is subject to error. In this case we would first obtain the regression 
of X on F 

X r = c + dY 
and determine F from 

X r - C 






3.15 Example of multiple regression. Fulweiler, Stang, and 
Sweetman (18) give the following data on worn wire rope of nominal 
diameter Mi to ^ inch. Can tensile strength be estimated from a 
linear relationship connecting U and V?



Y = tensile strength in thousands of pounds per square inch
U = number of broken wires in worst lay
V = length of worn surface

    Y      U      V      |      Y      U      V
  174      8    0.14     |    178      9    0.14
  185      0    0.00     |    185     12    0.14
  188      8    0.12     |    172      0    0.00
  160     14    0.11     |    158      8    0.11
  179      0    0.00     |    162     14    0.11
  183      0    0.11     |    176      0    0.00
  191      0    0.09     |    192     29    0.13
  177      0    0.00     |    198     37    0.13
  183      0    0.12     |    158      0    0.00
  186      0    0.13     |    152      6    0.10
  180      0    0.19     |    136      7    0.08
  184      3    0.15     |    174      0    0.00
  175      5    0.13     |    180      2    0.15
  175      0    0.00     |    172      5    0.15
  166      2    0.16     |    172      0    0.00
  170     14    0.15     |    174      0    0.13
  180      0    0.00     |    162      5    0.13
  181      0    0.12     |    165      0    0.00
  201     12    0.15     |    153     12    0.11
  172      0    0.00     |    111     22    0.11
  184      5    0.16     |    173      0    0.00
  145     11    0.15     |    177      0    0.09
  172      0    0.00     |    180     19    0.10
  133      8    0.11     |    172      0    0.00
  157     21    0.14     |    181      0    0.10
  175      0    0.00     |    138     14    0.10


3.16 Normal equations. The theory of the previous chapter applies;
a multiple regression function of the linear form

$$Y_r = a + bU + cV$$

has two degrees of freedom, i.e., as many degrees of freedom as there are
independent variables. The residual variance has $n - 3$ degrees of
freedom, three degrees of freedom being lost by the calculation of $a$
(or $\bar{Y}$), $b$, and $c$ from the data. If this regression plane is to be fitted
to the ungrouped data by the method of least squares, we shall have the
following normal equations:

$$an + b\sum U + c\sum V = \sum Y$$
$$a\sum U + b\sum U^2 + c\sum UV = \sum UY$$
$$a\sum V + b\sum UV + c\sum V^2 = \sum VY$$

from which a, b, and c may be found.
We have

    n      = 52
    ΣY     = 8,907
    ΣU     = 312
    ΣV     = 4.54
    ΣUV    = 39.05
    ΣUY    = 52,335
    ΣVY    = 778.41
    ΣU²    = 5,372
    ΣV²    = 0.5926

from which

    a = 171.258
    b = -0.4133
    c = 28.750



The analysis of variance follows:

Source of variation    Sum of squares    Degrees of freedom    Mean square
Regression                    479.323             2                239.66
Residual                   13,935.350            49                284.39
Total                      14,414.673            51                 ...



The sums of squares may be computed from any two of the following:

$$\sum (Y - \bar{Y})^2 = \sum Y^2 - \frac{(\sum Y)^2}{n}$$
$$\sum (Y_r - \bar{Y})^2 = a\sum Y + b\sum UY + c\sum VY - \frac{(\sum Y)^2}{n}$$
$$\sum (Y - Y_r)^2 = \sum Y^2 - a\sum Y - b\sum UY - c\sum VY$$

The mean square due to regression is seen to be even less than that not
associated with regression. No F test is necessary; clearly the relationship

$$Y_r = 171.258 - 0.4133U + 28.750V$$

is inadequate.
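
As a check on the arithmetic of this section, the normal equations can be solved directly from the sums quoted in the text. The following is a minimal sketch, not the author's computing scheme: it reduces the three equations to two by working about the means and applies Cramer's rule. The variable names are illustrative, and the total sum of squares is taken from the printed analysis of variance rather than recomputed from the raw data.

```python
# Sums printed in the text for the n = 52 wire-rope observations.
n = 52
sum_y, sum_u, sum_v = 8907.0, 312.0, 4.54
sum_uu, sum_vv, sum_uv = 5372.0, 0.5926, 39.05
sum_uy, sum_vy = 52335.0, 778.41

# Sums of squares and products about the means.
S_uu = sum_uu - sum_u ** 2 / n
S_vv = sum_vv - sum_v ** 2 / n
S_uv = sum_uv - sum_u * sum_v / n
S_uy = sum_uy - sum_u * sum_y / n
S_vy = sum_vy - sum_v * sum_y / n

# Solve the two reduced normal equations by Cramer's rule.
det = S_uu * S_vv - S_uv ** 2
b = (S_uy * S_vv - S_vy * S_uv) / det
c = (S_uu * S_vy - S_uv * S_uy) / det
a = (sum_y - b * sum_u - c * sum_v) / n

# Regression sum of squares; the total is assumed from the printed table.
total_ss = 14414.673
regression_ss = b * S_uy + c * S_vy
residual_ss = total_ss - regression_ss
f_ratio = (regression_ss / 2) / (residual_ss / (n - 3))

print(f"a = {a:.3f}, b = {b:.4f}, c = {c:.3f}")
print(f"regression SS = {regression_ss:.1f}, F = {f_ratio:.2f}")
```

The regression sum of squares obtained this way differs from the printed 479.323 only in the first decimal, presumably because the printed sums are rounded. The variance ratio comes out below 1, which agrees with the remark that no F test is needed.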

3.17 Further examples. The following data on the tensile strength, 
hardness, and density of 60 specimens of die-cast aluminum are given 
by Shewhart (36). 



Tensile strength in pounds per square inch; hardness in Rockwell E;
density in grams per cubic centimeter.

 Tensile                          |   Tensile
 strength   Hardness   Density    |   strength   Hardness   Density
  29,314      53.0      2.666     |    29,250      71.3      2.648
  34,860      70.2      2.708     |    27,992      52.7      2.400
  36,818      84.3      2.865     |    31,852      76.5      2.692
  30,120      55.3      2.627     |    27,646      63.7      2.669
  34,020      78.5      2.581     |    31,698      69.2      2.628
  30,824      63.5      2.633     |    30,844      69.2      2.696
  35,396      71.4      2.671     |    31,988      61.4      2.648
  31,260      53.4      2.650     |    36,640      83.7      2.775
  32,184      82.5      2.717     |    41,578      94.7      2.874
  33,424      67.3      2.614     |    30,496      70.2      2.700
  37,694      69.5      2.524     |    29,668      80.4      2.583
  34,876      73.0      2.741     |    32,622      76.7      2.668
  24,660      55.7      2.619     |    32,822      82.9      2.679
  34,760      85.8      2.755     |    30,380      55.0      2.609
  38,020      95.4      2.846     |    38,580      83.2      2.721
  25,680      51.1      2.575     |    28,202      62.6      2.678
  25,810      74.4      2.561     |    29,190      78.0      2.610
  26,460      54.1      2.593     |    35,636      84.6      2.728
  28,070      77.8      2.639     |    34,332      64.0      2.709
  24,640      52.4      2.611     |    34,750      75.3      2.880
  25,770      69.1      2.696     |    40,578      84.8      2.949
  23,690      53.5      2.606     |    28,900      49.4      2.669
  28,650      64.3      2.616     |    34,648      74.2      2.624
  32,380      82.7      2.748     |    31,244      59.8      2.705
  28,210      55.7      2.518     |    33,802      75.2      2.736
  34,002      70.5      2.726     |    34,850      57.7      2.701
  34,470      87.5      2.875     |    36,690      79.3      2.776
  29,248      50.7      2.585     |    32,344      67.6      2.754
  28,710      72.3      2.547     |    34,440      77.0      2.660
  29,830      59.5      2.606     |    34,650      74.8      2.819

Is the multiple regression of tensile strength (Y) on hardness (U) and
density (V) significant?

Source of variation    Sum of squares    Degrees of freedom    Mean square
Regression               524,681,187              2            262,340,593.5
Residual                 417,700,935             57              7,328,086
Total                    942,274,585             59

$$F = \frac{262{,}340{,}594}{7{,}328{,}086} = 35.80$$

From Table VIII, for 2 and 57 degrees of freedom, we need F = 4.98 for
highly significant regression. Hence the equation found by Shewhart

$$Y_r = 150.988U + 15310.35V$$

describes the relationship between tensile strength, hardness, and density.

In the preceding chapter, it was shown that a linear relationship
between the life of light bulbs and a certain test of filament wire was
not statistically significant. A second type of test was made on the
wire. Jennett and Dudding (24) report the following results, the first
two columns showing the data already considered on page 103.



LIFE OF BULBS     TEST OF WIRE     TEST OF WIRE
      Y                 U                V
    1605               276             14.2
    1120               293             15.6
    1320               288             16.1
    1225               315             ...
    1055               305             15.2
    1390               315             14.6
    1385               306             21.4
    1700               286             19.4
    2070               289             18.9
    1395               296             18.5
    1105               335             20.8



Would a multiple relation of the form 

Y r = a + bU + cV 

succeed where the linear relation between Y and U failed? 

    a = 4,336.29
    b = -12.69
    c = 49.80

Source of variation    Sum of squares    Degrees of freedom    Mean square
Regression                388,795.13              2              194,397.57
Residual                  481,227.37              7               68,746.77
Total                     870,022.50              9

$$F = \frac{194{,}397.57}{68{,}746.77} = 2.83$$

From Table VIII, for 2 and 7 degrees of freedom, a value of F = 4.74 is
required for significance at the 0.05 level. The regression is not
statistically significant even when two independent variables are included.
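
The fit for the light-bulb example can be reproduced from the tabulated data. The sketch below is a plain least-squares calculation and not necessarily the author's computing routine. It rests on one assumption that is inferred rather than stated: only the ten bulbs with a recorded value of V are used, so that the bulb with Y = 1225 and U = 315 is left out. That assumption is consistent with the 9 total degrees of freedom and reproduces the printed total sum of squares.

```python
# (Y, U, V) for the ten bulbs that have both wire tests recorded.
# Assumption: the bulb with Y = 1225, U = 315 has no V value and is omitted.
data = [
    (1605, 276, 14.2), (1120, 293, 15.6), (1320, 288, 16.1),
    (1055, 305, 15.2), (1390, 315, 14.6), (1385, 306, 21.4),
    (1700, 286, 19.4), (2070, 289, 18.9), (1395, 296, 18.5),
    (1105, 335, 20.8),
]
n = len(data)
mean_y = sum(y for y, _, _ in data) / n
mean_u = sum(u for _, u, _ in data) / n
mean_v = sum(v for _, _, v in data) / n

# Sums of squares and products about the means.
S_uu = sum((u - mean_u) ** 2 for _, u, _ in data)
S_vv = sum((v - mean_v) ** 2 for _, _, v in data)
S_uv = sum((u - mean_u) * (v - mean_v) for _, u, v in data)
S_uy = sum((u - mean_u) * (y - mean_y) for y, u, _ in data)
S_vy = sum((v - mean_v) * (y - mean_y) for y, _, v in data)
total_ss = sum((y - mean_y) ** 2 for y, _, _ in data)

# Least-squares coefficients from the reduced normal equations.
det = S_uu * S_vv - S_uv ** 2
b = (S_uy * S_vv - S_vy * S_uv) / det
c = (S_uu * S_vy - S_uv * S_uy) / det
a = mean_y - b * mean_u - c * mean_v

# Analysis of variance: 2 degrees of freedom for regression, n - 3 residual.
regression_ss = b * S_uy + c * S_vy
residual_ss = total_ss - regression_ss
f_ratio = (regression_ss / 2) / (residual_ss / (n - 3))

print(f"a = {a:.2f}, b = {b:.2f}, c = {c:.2f}, F = {f_ratio:.2f}")
```

Under this reading of the table the coefficient of U is negative, about -12.69, and the variance ratio agrees with the value of 2.83 given in the text.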



NOTES 

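3.18 Least squares and multiple regression. In fitting a regression
plane

$$Y_r = a + bU + cV$$

to observed data by the method of least squares,

$$\sum (Y - Y_r)^2$$

is required to be minimum, Y being the observed values of the dependent
variable. Write

$$\varphi = \sum (Y - a - bU - cV)^2$$

For $\varphi$ minimum

$$\frac{\partial \varphi}{\partial a} = -2\sum (Y - a - bU - cV) = 0$$
$$\frac{\partial \varphi}{\partial b} = -2\sum (Y - a - bU - cV)U = 0 \qquad [6]$$
$$\frac{\partial \varphi}{\partial c} = -2\sum (Y - a - bU - cV)V = 0$$

Furthermore $\partial^2 \varphi / \partial a^2$, $\partial^2 \varphi / \partial b^2$, $\partial^2 \varphi / \partial c^2$ are positive. Rearranging [6] we have
the normal equations

$$an + b\sum U + c\sum V = \sum Y$$
$$a\sum U + b\sum U^2 + c\sum UV = \sum UY$$
$$a\sum V + b\sum UV + c\sum V^2 = \sum VY$$

which may be solved simultaneously for a, b, and c.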

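3.19 Computation of the sums of squares for multiple regression analysis.
In order to reduce the labor involved in computing sums of squares, the
following were suggested:

$$\sum (Y - \bar{Y})^2 = \sum Y^2 - n\bar{Y}^2$$

$$\sum (Y_r - \bar{Y})^2 = \sum (Y_r - \bar{Y})(Y_r - \bar{Y}) = \sum (Y_r - \bar{Y})Y_r$$
$$= \sum (a + bU + cV - \bar{Y})Y_r$$
$$= a\sum Y_r + b\sum UY_r + c\sum VY_r - \bar{Y}\sum Y_r$$

since

$$\sum Y_r = \sum (a + bU + cV) = \sum Y$$

and similarly for $\sum UY_r$ and $\sum VY_r$.

$$\sum (Y - Y_r)^2 = \sum (Y - Y_r)(Y - Y_r) = \sum (Y - Y_r)Y - \sum (Y - Y_r)(a + bU + cV)$$

From the three normal equations, we have

$$\sum (Y - Y_r) = 0, \qquad \sum (Y - Y_r)U = 0, \qquad \sum (Y - Y_r)V = 0$$

Hence

$$\sum (Y - Y_r)^2 = \sum (Y - Y_r)Y = \sum (Y - a - bU - cV)Y$$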

3.20 Analysis of covariance. Furry (19) gives the following data on
the breaking strength and thickness of starch films. 

BREAKING STRENGTH AND THICKNESS OF STARCH FILMS* 



Film 


Thickness 
(inches 
X 10" 4 ) 


Breaking 
strength 
(grams) 


Film 


Thickness 
(inches 
X 10~ 4 ) 


Breaking 
strength 
(grams) 


WHEAT STARCH 


RICE STARCH 


1 


5.0 


263.7 


1 


7.1 


556.7 


2 


3.5 


130.8 


2 


6.7 


552.5 


3 


4.7 


382.9 


3 


5.6 


397.5 


4 


4.3 


302.5 


4 


8.1 


532.3 


5 


3.8 


213.3 


5 


8.7 


587.8 


6 


3.0 


132.1 


6 


8.3 


520.9 


7 


4.2 


292.0 


7 


8.4 


574.3 


8 


4.5 


315.5 


8 


7.3 


505.0 


9 


4.3 


262.4 


9 


8.5 


604.6 


10 


4.1 


314.4 


10 


7.8 


522.5 


11 


5.5 


310.8 


11 


8.0 


555.0 


12 

1 O 


4.8 

40 


280.0 

QQ1 7 


12 


8.4 


561.1 


13 
14 


.0 

8.0 


ool. / 
672.5 


SWEET POTATO STARCH 


15 


7.4 


496.0 


1 


9.4 


837.1 


16 


5.2 


311 9 


2 


10.6 


901.2 


17 


4.7 


276.7 


3 


9.0 


595.7 


18 


5.4 


325.7 


4 


7.6 


510.0 


19 


5.4 


310.8 








20 


5.4 

4f\ 


288.0 

OJfl O 


CANNA STARCH 


21 


.9 


2o9.o 


1 


7.7 


7Q1 7 


DASHEEN STARCH 


2 


6.3 


I i/A . 1 

610.0 




3 


Q A 


710 n 


1 


7.0 


485.4 


o 
4 


O \J 

11.8 


1 1U. U 

940.7 


2 


6.0 


395.4 


5 


12.4 


990.0 


3 


7.1 


465.4 


6 


12.0 


916.2 


4 


5.3 


371.4 


7 


11.4 


835.0 


5 


6.2 


402.0 


8 


10.4 


724.3 


6 


5.8 


371.9 


9 


9.2 


611.1 


7 


6.6 


430.0 


10 


9.0 


621.7 


8 


6.6 


380.0 


11 


9.5 


735.4 








12 


12.5 


990.0 








13 


11.7 


862.7 



*Each value for thickness and breaking strength is the average of five or more film-strip tests. 
CORN STARCH


POTATO STARCH 


1 


8.0 


731.0 


1 


13.0 


983.3 


2 


7.3 


710.0 


2 


13.3 


958.8 


3 


7.2 


604.7 


3 


10.7 


747.8 


4 


6.1 


508.8 


4 


12.2 


866.0 


5 


6.4 


393.0 


5 


11.6 


810.8 


6 


6.4 


416.0 


6 


9.7 


950.0 


7 


6.9 


400.0 


7 


10.8 


1,282.0 


8 


5.8 


335.6 


8 


10.1 


1,233.8 


9 


5.3 


306.4 


9 


12.7 


1,660.0 


10 


6.7 


426.0 


10 


9.8 


746.0 


11 


5.8 


382.5 


11 


10.0 


650.0 


12 


5.7 


340.8 


12 


13.8 


992.5 


13 


6.1 


436.7 


13 


13.3 


896.7 


14 


6.2 


333.3 


14 


12.4 


873.9 


15 


6.3 


382.3 


15 


12.2 


924.4 


16 


6.0 


397.7 


16 


14.1 


1,050.0 


17 


6.8 


619.1 


17 


13.7 


973.3 


18 


7.9 


857.3 








19 


7.2 


592.5 











ANALYSIS OF VARIANCE OF BREAKING STRENGTHS

Source of variation    Sum of squares    Degrees of freedom    Mean square
Among starches           5,307,433.08             6              884,572.18
Within starches          1,987,918.13            87               22,849.63
Total                    7,295,351.21            93                 ...



The differences in breaking strengths from starch to starch are highly
significant. But examination of the data indicates that at least some of this
apparent significance is due merely to differences in the thickness of the
starch film and not to any chemical superiority of certain starches over others.
To determine the effect of thickness on strength, the relationships between the
two must first be measured for each starch; for our purposes the best measure
is given by the regression coefficient $b$ (the slope of the regression line) of
breaking strength on thickness.

$$b = \frac{\sum XY - (\sum X)(\sum Y)/n}{\sum X^2 - (\sum X)^2/n}$$

where Y represents a breaking strength value and X represents the
corresponding value of thickness. To illustrate, we have for sweet-potato starch

$$b = \frac{26{,}658.76 - 26{,}022.60}{339.48 - 334.89} = 138.60$$

We form the following table.*

Starch               Σy²           Σxy        Σx²        b       Σy² - (Σxy)²/Σx²
Wheat            254,104.85     2,310.53     25.96     89.00         48,459.67
Dasheen           13,215.27       156.39      2.65     59.02          3,985.90
Corn             447,055.68     1,866.60      9.88    188.93         94,404.31
Rice              30,993.44       417.61      9.15     45.64         11,933.54
Sweet potato     105,772.34       636.16      4.59    138.60         17,602.50
Canna            232,013.75     2,763.55     46.61     59.29         68,160.32
Potato           904,762.80     1,192.81     36.66     32.54        865,952.22
Total          1,987,918.13     9,343.65    135.50                1,110,498.46



We should like to eliminate from the variation in breaking strength that
part attributable to variation in thickness; to do so it would be convenient to
use an average within-starch regression coefficient such as would be given by
$b = 9{,}343.65/135.50$. This is permissible only if the differences among the
seven regression coefficients are not statistically significant. To test the
latter, we have



Source of variation                            Sum of squares    Degrees of freedom    Mean square
Deviation of within-starch regression lines
  from average within-starch regression line      233,111.22              6              38,851.87
Deviation of observations from within-starch
  regression (error)                             1,110,498.46            80              13,881.23
Deviation of observations from average
  within-starch regression line                  1,343,609.68            86





The calculation of the above sums of squares can be carried out in the
following way:

$$\sum (Y - \bar{Y}_r)^2 = \sum y^2 - \frac{(\sum xy)^2}{\sum x^2}$$

The value of $\sum y^2$ for within starches for the entire table is 1,987,918.13.
Also

$$\frac{(\sum xy)^2}{\sum x^2} = \frac{(9{,}343.65)^2}{135.50} = 644{,}308.45$$

or

$$\sum (Y - \bar{Y}_r)^2 = 1{,}343{,}609.68$$

In $\sum (Y - Y_r)^2$, deviations are measured from the within-starch regression
lines, i.e.,

$$\sum (Y - Y_r)^2 = 1{,}110{,}498.46$$

By subtraction

$$\sum (Y_r - \bar{Y}_r)^2 = 233{,}111.22$$

The allocation of degrees of freedom is easily explained. For the total sum of
squares, $\sum (Y - \bar{Y}_r)^2$, there are the 87 degrees of freedom allocated to
within-starch variation in the original analysis of variance less the 1 degree of
freedom lost by the fact that the deviations are taken about the over-all
within-starch regression line. Second, from the original 87 degrees for
within-starch variation we must subtract 7 degrees of freedom attributable to the
7 regression lines, leaving 80 degrees of freedom for the term $\sum (Y - Y_r)^2$.
By subtraction, there remain 6 degrees of freedom for $\sum (Y_r - \bar{Y}_r)^2$.
Applying the F test, we find

$$F = \frac{38{,}851.87}{13{,}881.23} = 2.80$$

which for 6 and 80 degrees of freedom is significant at the 5 per cent level but
not significant at the 1 per cent level. It is up to the analyst to decide whether
or not he will continue. We shall consider the regression coefficients to be not
significantly different; they are presumed to be replaced by

$$b = \frac{9{,}343.65}{135.50} = 68.96$$
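
The test of whether the seven within-starch slopes may be pooled can be retraced from the per-starch sums in the table above. The following is a sketch under the assumption that the tabulated values of Σy², Σxy, and Σx² are exact; the dictionary and variable names are illustrative only.

```python
# Per-starch sums about the starch means: (sum y^2, sum xy, sum x^2).
starches = {
    "Wheat":        (254104.85, 2310.53, 25.96),
    "Dasheen":      (13215.27, 156.39, 2.65),
    "Corn":         (447055.68, 1866.60, 9.88),
    "Rice":         (30993.44, 417.61, 9.15),
    "Sweet potato": (105772.34, 636.16, 4.59),
    "Canna":        (232013.75, 2763.55, 46.61),
    "Potato":       (904762.80, 1192.81, 36.66),
}
within_df = 87          # within-starch degrees of freedom from the first table
k = len(starches)       # number of separate regression lines

sum_yy = sum(yy for yy, _, _ in starches.values())
sum_xy = sum(xy for _, xy, _ in starches.values())
sum_xx = sum(xx for _, _, xx in starches.values())

# Average within-starch slope and the deviations about that single slope.
pooled_slope = sum_xy / sum_xx
ss_about_pooled = sum_yy - sum_xy ** 2 / sum_xx

# Deviations about the seven individual regression lines (error).
ss_about_individual = sum(yy - xy ** 2 / xx for yy, xy, xx in starches.values())

# Variation among the slopes, obtained by subtraction.
ss_among_slopes = ss_about_pooled - ss_about_individual
df_among = k - 1
df_error = within_df - k
f_ratio = (ss_among_slopes / df_among) / (ss_about_individual / df_error)

print(f"pooled slope = {pooled_slope:.2f}, F = {f_ratio:.2f} "
      f"on {df_among} and {df_error} degrees of freedom")
```

Small differences from the printed sums of squares in the last decimal are to be expected, since the tabulated sums are themselves rounded.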



We are now able to calculate adjusted breaking strength means for each
starch, each strength mean being corrected by elimination of the effect of
thickness. The figure below illustrates the situation. Clearly most of the
variation in breaking strength is explained by variation in thickness.

[Figure: average within-starch regression line, $\bar{Y}_r - \bar{Y} = 68.96x$; horizontal axis: thickness.]

We have, for any mean $\bar{Y}_s$

$$\bar{Y}_s = \bar{Y} + \underbrace{(\bar{Y}_s - \bar{Y}_r)}_{\text{variation around regression}} + \underbrace{(\bar{Y}_r - \bar{Y})}_{\text{variation due to regression}}$$

We wish to eliminate the effect of the last term on $\bar{Y}_s$. Hence we want
corrected values of $\bar{Y}_s$ given by

$$\text{Corrected } \bar{Y}_s = \bar{Y} + (\bar{Y}_s - \bar{Y}_r)$$

or since

$$\bar{Y}_r - \bar{Y} = bx$$

$$\text{Corrected } \bar{Y}_s = \bar{Y}_s - bx$$

The following tabular form is convenient.



                 Original mean                                                Corrected mean
                 breaking          Mean                                       breaking
Starch           strength          thickness     x = X̄_s - X̄    bx = b(X̄_s - X̄)    strength
                 Ȳ_s               X̄_s
Wheat              308.7             4.90          -3.00          -206.9          515.6
Dasheen            412.7             6.33          -1.57          -108.3          521.0
Corn               482.8             6.53          -1.37           -94.5          577.3
Rice               539.2             7.74          -0.16           -11.0          550.2
Sweet potato       711.0             9.15           1.25            86.2          624.8
Canna              795.3            10.19           2.29           157.9          637.4
Potato             976.5            11.96           4.06           280.0          696.5
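
The correction of each starch mean for thickness is a one-line calculation, sketched below. The over-all mean thickness of 7.90 is not quoted in the text; it is inferred here from the tabulated deviations (for wheat, 4.90 + 3.00) and should be read as an assumption, as should the variable names.

```python
b = 68.96                      # average within-starch regression coefficient
grand_mean_thickness = 7.90    # assumption: inferred from the tabulated deviations

# (starch, original mean breaking strength, mean thickness)
starch_means = [
    ("Wheat", 308.7, 4.90),
    ("Dasheen", 412.7, 6.33),
    ("Corn", 482.8, 6.53),
    ("Rice", 539.2, 7.74),
    ("Sweet potato", 711.0, 9.15),
    ("Canna", 795.3, 10.19),
    ("Potato", 976.5, 11.96),
]

corrected = {}
for name, mean_strength, mean_thickness in starch_means:
    x = mean_thickness - grand_mean_thickness   # deviation of thickness
    corrected[name] = mean_strength - b * x     # remove the thickness effect

for name, value in corrected.items():
    print(f"{name:<13} {value:7.1f}")
```

After correction the means span a far narrower interval than the original means, which is the point the adjusted analysis of variance makes formally.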



We can now judge the effect of the independent variable, thickness. The 
original analysis of variance is replaced by the following table : 



Source of variation    Sum of squares    Degrees of freedom    Mean square
Among starches              99,947.77             6               16,657.96
Within starches          1,343,609.68            86               15,623.37
Total                    1,443,557.45            92





The total sum of squares 1,443,557.45 differs from the previous total
7,295,351.21, for the latter represented variation about the grand mean $\bar{Y}$, i.e.,

$$\sum (Y - \bar{Y})^2$$

whereas the present total sum of squares represents variation of the
observations about a regression line fitting the 94 points, i.e.,

$$\sum (Y - Y_r)^2$$

The same is true for the present within-starch sum of squares. The
among-starch sum of squares is obtained by subtraction. The necessity of computing
these terms makes it desirable to set up at the outset a table of the form shown
herewith.



Source of variation         Σy²             Σxy           Σx²
Among starches          5,307,433.09     56,268.32      600.16
Within starches         1,987,918.13      9,343.65      135.50
Total                   7,295,351.22     65,611.97      735.66



Using the equation given on page 122, we have

$$7{,}295{,}351.22 - \frac{(65{,}611.97)^2}{735.66} = 1{,}443{,}557.45$$

$$1{,}987{,}918.13 - \frac{(9{,}343.65)^2}{135.50} = 1{,}343{,}609.68$$

and the value 99,947.77 is obtained by subtraction. The degrees of freedom
differ from the original table only in that there are 86 instead of 87 degrees of
freedom for the within-starch (error) term, for now error is measured by
variation about regression rather than variation about starch means, and 1
additional degree of freedom is lost by the calculation of the regression
coefficient from the data.
An F test yields

$$F = \frac{16{,}657.96}{15{,}623.37} = 1.066$$

which, for 6 and 86 degrees of freedom, is not significant. The differences in
breaking strengths are attributable to differences in thicknesses and not to
kinds of starches. Judgment on the basis of the original analysis of variance
might have been misleading; the inclusion of the independent variable,
thickness, was essential if proper conclusions were to be reached.
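
The adjusted analysis of variance can be rebuilt from the three sums in the total and within-starch rows. The sketch below assumes only the tabulated values of Σy², Σxy, and Σx²; the function name `adjusted_ss` is illustrative and does not come from the text.

```python
def adjusted_ss(sum_yy, sum_xy, sum_xx):
    """Sum of squares of Y about the regression on X."""
    return sum_yy - sum_xy ** 2 / sum_xx

# (sum y^2, sum xy, sum x^2) from the table of sums of squares and products.
total_row = (7295351.22, 65611.97, 735.66)
within_row = (1987918.13, 9343.65, 135.50)

adj_total = adjusted_ss(*total_row)
adj_within = adjusted_ss(*within_row)
adj_among = adj_total - adj_within          # obtained by subtraction

df_among, df_within = 6, 86                 # one error degree of freedom lost to b
f_ratio = (adj_among / df_among) / (adj_within / df_within)

print(f"adjusted total  = {adj_total:,.2f}")
print(f"adjusted within = {adj_within:,.2f}")
print(f"adjusted among  = {adj_among:,.2f}")
print(f"F = {f_ratio:.3f}")
```

Comparing this ratio with the unadjusted one, 884,572.18/22,849.63, shows how much of the apparent difference among starches was carried by thickness.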



CHAPTER IV 
SYSTEMATIC QUALITY CONTROL 

4.1 Introduction. The preceding chapters considered the design 
and analysis of industrial experiments which aim to identify the factors 
responsible for variable quality. The present chapter describes the con- 
tribution to this objective of a routinized system of recording quality 
data. 

In this chapter our objective is, essentially, to judge the quality of 
current output against standards. The standards may be set by 
technical commissions, by government, or they may represent the 
quality of the product during the past. If current output, as known 
from current samples of information, departs from the standard by an 
amount which is statistically significant, an economic loss may be 
involved. Output, the quality of which is significantly higher than 
intended, implies wastage of labor and materials; the dis-economies of 
lower than standard quality are equally obvious. If possible, the re- 
sponsible factor or factors should be immediately identified and removed. 

Standards formed from past experience may be based on the quality 
records of a fixed period of time during the past or on a period which 
changes as time goes on in order to incorporate the records of the more 
recent past. There are also variations of these schemes. We shall 
illustrate only the case in which the time period is increasing. That is, 
after the quality of the output of say the 50th day is judged against the 
standard of the previous 49 days, the data of the 50th day is added to the 
previous population, and the quality of the 51st day is judged by com- 
parison with the standards based on 50 days of data. The method of 
handling fixed or other varieties of shifting standards involves only 
minor changes in the following discussion and the reader can supply them 
for himself. 

Standards based on accumulated data are valid only if those data are 
homogeneous. Thus a standard in the form of a mean of say 20 weeks' 
data has little sense if the 20 weekly means differ significantly among 
themselves. We shall test each population for such homogeneity. In
order to preserve the homogeneity of shifting populations it has occasionally
been the practice not to add to the population without adjustment any data
on current output which departs significantly from the standard. This
practice has no statistical justification and is not recommended.

Finally the populations must be large relative to the size of the current
samples. This is clearly sensible from a practical point of view; moreover
it is important statistically if we are to assume exact knowledge of
such population parameters as $\bar{X}'$ and $\sigma$, which are or form the basis of
industrial standards.

4.2 Population: formation and homogeneity. Supplement B of the 
American Society for Testing Materials' " Manual on Presentation of 
Data " (1 ) gives the following information on an operating characteristic : 



Sample number    Sample size n    Mean quality X̄    Standard deviation s
      1               50              35.7                5.35
      2               50              34.6                5.03
      3               50              32.6                3.43
      4               50              35.3                4.55
      5               50              33.4                4.10
      6               50              35.2                4.30
      7               50              33.3                5.18
      8               50              33.9                5.30
      9               50              32.3                3.09
     10               50              33.7                3.67



From this information we want to estimate the mean $\bar{X}'$ and the
standard deviation $\sigma$ of the population of 500 observations formed by
combining these 10 samples.

For $k$ samples of size $n_1, \ldots, n_k$ with respective means $\bar{X}_1, \ldots, \bar{X}_k$
and respective variances $s_1^2, \ldots, s_k^2$, we know that



and



In the present example

$$\bar{X}' = 34.0$$

and

$$\sigma = 4.51$$

Is this population homogeneous in its mean? Assuming normality,
approximately 90 per cent of the sample means (i.e., nine means) should
fall within $\bar{X}' \pm 1.65\sigma_{\bar{X}}$, where $\sigma_{\bar{X}} = \sigma/\sqrt{n}$. We have

$$\sigma_{\bar{X}} = \frac{4.51}{\sqrt{50}} = 0.638$$

Only five means fall within $\bar{X}' \pm 1.65\sigma_{\bar{X}}$. The population cannot be
said to be homogeneous in its mean.

Possibly the lack of homogeneity of the population is due to significant
differences among the 10 sample standard deviations. The distribution
of $s$ for random samples from a normal population was given on page 46.
For large samples, say $n > 30$, it is easy to show that this distribution is
approximately normal with standard error $\sigma_s$ equal to $\sigma/\sqrt{2n}$. We have

$$\sigma_s = \frac{4.51}{\sqrt{100}} = 0.451$$

Five values of $s$ fall outside the range $\sigma \pm 1.65\sigma_s$; the sample standard
deviations differ significantly among themselves. The parameters
of this population ($\bar{X}'$ and $\sigma$) could not be effectively used as standards
with which to compare current quality.
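
The two homogeneity counts in this section are easy to verify. The sketch below takes the population values 34.0 and 4.51 as given in the text rather than recomputing them, and simply counts how many sample means and sample standard deviations lie inside the 1.65 standard-error bands. The variable names are illustrative.

```python
from math import sqrt

# Sample means and standard deviations of the 10 samples, each of size 50.
means = [35.7, 34.6, 32.6, 35.3, 33.4, 35.2, 33.3, 33.9, 32.3, 33.7]
sds = [5.35, 5.03, 3.43, 4.55, 4.10, 4.30, 5.18, 5.30, 3.09, 3.67]
n = 50

pop_mean = 34.0   # population mean as given in the text
pop_sd = 4.51     # population standard deviation as given in the text

# Standard errors of a sample mean and of a sample standard deviation.
se_mean = pop_sd / sqrt(n)
se_sd = pop_sd / sqrt(2 * n)

means_inside = [m for m in means if abs(m - pop_mean) <= 1.65 * se_mean]
sds_outside = [s for s in sds if abs(s - pop_sd) > 1.65 * se_sd]

print(f"standard error of mean = {se_mean:.3f}")
print(f"standard error of s    = {se_sd:.3f}")
print(f"means inside the band  = {len(means_inside)} of {len(means)}")
print(f"s values outside band  = {len(sds_outside)} of {len(sds)}")
```

With about nine of ten values expected inside each band under homogeneity, the counts obtained make the lack of control evident.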

4.3 Example involving $\bar{X}$. The data in the first three columns of
the preceding table have been reported by Pettebone and Young (32).
They cover 50 consecutive samples each of 14 observations; the quality
characteristic is the Btu value of a mixed fuel gas.

The functions of the various columns will be discussed presently. The
tabular form used was originally given by Dudding and Baker (10).

In order to permit the reader to check all entries in the first three
rows of the preceding table, the individual observations of the first
three samples are now given.

BTU VALUE

Sample 1    Sample 2    Sample 3
  533         546         541
  537         540         534
  535         542         528
  540         550         538
  542         541         542
  547         547         540
  543         546         539
  536         539         538
  531         542         543
  530         535         533
  536         542         538
  534         542         537
  538         544         540
  541         549         539

4.4 The range. In the last column the range, which is the difference
between the largest and smallest observations, is recorded. The principal
merit of the range lies in the ease with which it is computed, and its
utility arises from the fact that for a normal population the mean range
over $k$ samples ($k$ large) bears a fixed relationship to the more common
measure of variation, the standard deviation $\sigma$ of the population. For
$k = 50$, the mean range will be found to be 10.68. For sample size $n = 14$,
we find from Table VI

$$\frac{\text{mean range}}{\sigma} = 3.40676$$

or

$$\sigma = 3.135$$

which is in good agreement with $\sigma = 3.26$ found by the more efficient
method that is illustrated on page 127. For small $k$, say less than 10,
or large $n$, say more than 15, the mean range method of estimating $\sigma$ is
unreliable. In our example, the agreement between the two methods of
estimating $\sigma$ happens to be good even for small $k$. Thus for $k = 2$, the
two methods of estimating yield 4.52 and 4.70; for $k = 5$, they yield
3.19 and 3.93.
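
The range method amounts to one division, as the sketch below shows. The factor 3.40676 for samples of 14 is the value quoted from Table VI; the helper `sample_range` and the use of the second sample as an illustration are additions made here and are not part of the original table.

```python
def sample_range(observations):
    """Difference between the largest and smallest observations."""
    return max(observations) - min(observations)

# Ratio of mean range to sigma for samples of n = 14 (quoted from Table VI).
RANGE_FACTOR_N14 = 3.40676

# Estimate of sigma from the mean range over all k = 50 samples.
mean_range_50 = 10.68
sigma_from_range = mean_range_50 / RANGE_FACTOR_N14

# Range of one sample, using the 14 Btu observations listed for sample 2.
sample_2 = [546, 540, 542, 550, 541, 547, 546, 539, 542, 535, 542, 542, 544, 549]
range_2 = sample_range(sample_2)

print(f"sigma estimated from mean range = {sigma_from_range:.3f}")
print(f"range of sample 2 = {range_2}")
```

The same division applied to the mean range of only the first few samples gives the small-k estimates mentioned in the text.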

The columns of the large table contain the information needed for a
simple quality control record. Thus, at the end of the 20th week, the
mean and standard deviation of the accumulated population are

$$\bar{X}' = 539.06$$

and

$$\hat{\sigma} = 3.59$$

We wish to determine whether or not the mean of the sample of the
21st week (based on $n$ observations) falls within $2\sigma_{\bar{X}}$ of $\bar{X}'$, where

$$\sigma_{\bar{X}} = \frac{\hat{\sigma}}{\sqrt{n}} = \frac{3.59}{\sqrt{14}} = 0.96$$

It does not; hence the quality of the output of the 21st week does not
conform to the standard ($\bar{X}' = 539.06$). If records of influential
factors, such as kind of coal burned, are kept simultaneously, the cause
of lack of control can often be immediately spotted and corrected.

The steps of this procedure are systematized in the following table.
The calculations begin with the 20th sample so that the beginning
population will be 20 times as large as the first sample to be judged.

Columns: sample number; X̄' and σ̂ of the accumulated population; n, size of
following sample; σ_X̄ = σ̂/√n; lower and upper limits X̄' ± 2σ_X̄; X̄ of
following sample; whether the following sample is under control.

Sample                                                          X̄ of        Under
number     X̄'       σ̂      n     σ_X̄    X̄' - 2σ_X̄   X̄' + 2σ_X̄   following    control
                                                                sample
  20     539.06    3.59    14    0.96     537.14       540.98      537.50      Yes
  21     538.99    3.60    14    0.96     537.07       540.91      536.36      No
  22     538.87    3.56    14    0.95     536.97       540.77      538.07      Yes
  23     538.84    3.54    14    0.95     536.94       540.74      538.21      Yes
  24     538.81    3.54    14    0.95     536.91       540.71      536.93      Yes
  25     538.73    3.52    14    0.94     536.85       540.61      538.00      Yes
  26     538.71    3.48    14    0.93     536.85       540.57      536.14      No
  27     538.61    3.47    14    0.93     536.75       540.47      535.29      No
  28     538.49    3.43    14    0.92     536.65       540.33      534.43      No
  29     538.35    3.39    14    0.91     536.53       540.17      536.07      No
  30     538.28    3.36    14    0.90     536.48       540.08      537.71      Yes
  31     538.26    3.35    14    0.90     536.46       540.06      536.00      No
  32     538.19    3.34    14    0.89     536.41       539.97      537.79      Yes
  33     538.18    3.32    14    0.89     536.40       539.96      534.43      No
  34     538.07    3.29    14    0.88     536.31       539.83      536.93      Yes
  35     538.03    3.28    14    0.88     536.27       539.79      537.14      Yes
  36     538.01    3.24    14    0.87     536.27       539.75      537.29      Yes
  37     537.99    3.25    14    0.87     536.25       539.73      536.14      No
  38     537.94    3.25    14    0.87     536.20       539.68      537.21      Yes
  39     537.92    3.25    14    0.87     536.18       539.66      537.57      Yes
  40     537.91    3.24    14    0.87     536.17       539.65      538.86      Yes
  41     537.94    3.22    14    0.86     536.22       539.66      534.29      No
  42     537.85    3.23    14    0.86     536.13       539.57      536.43      Yes
  43     537.82    3.22    14    0.86     536.10       539.54      537.57      Yes
  44     537.81    3.22    14    0.86     536.09       539.53      536.21      Yes
  45     537.77    3.20    14    0.86     536.05       539.49      537.93      Yes
  46     537.78    3.22    14    0.86     536.06       539.50      534.93      No
  47     537.72    3.22    14    0.86     536.00       539.44      534.43      No
  48     537.65    3.24    14    0.87     535.91       539.39      536.50      Yes
  49     537.63    3.25    14    0.87     535.89       539.37      536.00      Yes
  50     537.59    3.26
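
Each row of the control table follows the same rule, which the sketch below applies to a few rows copied from the table. It is an illustration and not the original tabular routine: the standard error is rounded to two decimals before the limits are formed, an assumption made so that the limits agree with the printed ones, and the function name `control_limits` is hypothetical.

```python
from math import sqrt

def control_limits(pop_mean, pop_sd, n, width=2.0):
    """Limits pop_mean +/- width * (pop_sd / sqrt(n)), standard error rounded as in the table."""
    se = round(pop_sd / sqrt(n), 2)   # assumption: two-decimal rounding
    return pop_mean - width * se, pop_mean + width * se

# (sample number, accumulated mean, accumulated sd, mean of following sample)
rows = [
    (21, 538.99, 3.60, 536.36),
    (22, 538.87, 3.56, 538.07),
    (23, 538.84, 3.54, 538.21),
    (24, 538.81, 3.54, 536.93),
    (26, 538.71, 3.48, 536.14),
]
n = 14  # size of each following sample

judgments = []
for number, pop_mean, pop_sd, next_mean in rows:
    lower, upper = control_limits(pop_mean, pop_sd, n)
    under_control = lower <= next_mean <= upper
    judgments.append(under_control)
    print(f"{number}: limits {lower:.2f} to {upper:.2f}, "
          f"next mean {next_mean:.2f}, under control: {under_control}")
```

Widening or narrowing the band only requires changing the `width` argument, which is set to 2 here because the table uses two standard errors.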

































The judgments in the last column are valid only if the population
against which the sample is being tested is homogeneous. At each stage
of the process the homogeneity of the population should be tested.
Thus at the end of 20 weeks

$$\sigma_{\bar{X}} = \frac{\hat{\sigma}}{\sqrt{n}} = \frac{3.59}{\sqrt{14}} = 0.96$$

$$\bar{X}' = 539.06$$
$$\bar{X}' - 2\sigma_{\bar{X}} = 537.14$$
$$\bar{X}' + 2\sigma_{\bar{X}} = 540.98$$

In 8 out of 20 samples constituting this population (which is far greater
than the 5 per cent of 20 or 1 sample which could be attributed to
chance) the sample mean $\bar{X}$ fell outside the limits

$$\bar{X}' \pm 2\sigma_{\bar{X}}$$

This lack of control is possibly attributable to significant differences
among the sample standard deviations $s_i$. Very roughly, for $n$ as small
as 14, we have

$$\sigma_s = \frac{\hat{\sigma}}{\sqrt{2n}} = \frac{3.59}{\sqrt{28}} = 0.679$$

Two of 20 values of $s_i$ fall outside $\hat{\sigma} \pm 2\sigma_s$, only 1 above the allowable
limit (1 in 20).

We conclude that the population formed of the first 20 samples is
clearly not homogeneous and the judgment of the mean quality of the
21st sample given in the preceding table cannot properly be made. More
often than not, in industrial practice, population homogeneity will be
achieved only after months of effort, and a quality control program of
the kind suggested in this chapter will not be immediately possible.
In such cases the statistician can best serve by assisting in the design
and analysis of experiments which aim to identify the causes of
non-homogeneity.

4.5 Example involving fraction defective p. A similar procedure
is available wherever quality must be recorded simply as acceptable or
not. The first three columns of the following table exhibit data, recorded
by Shoumatoff (37), covering defects found in the primary inspection of
standard radio tubes.

The most important population parameters are the fraction defective
$\bar{p}$ and the standard deviation $\sigma_p$, and these will constitute the standards
in the quality control program. Later in this chapter it will be shown
that

$$\sigma_p = \sqrt{\frac{\bar{p}\bar{q}}{n}}$$

$p$ is the fraction defective in a sample of size $n$, i.e., $pn$ is the number
of defects in a sample, $\bar{p}$ is the population fraction defective, and
$q = 1 - p$, $\bar{q} = 1 - \bar{p}$.

The various columns of the following table show the necessary
sample statistics together with constantly revised estimates of the
population parameters. Inasmuch as $n$ differs from sample to sample,
we record $\bar{p}\bar{q}$ rather than $\bar{p}\bar{q}/n$.



        Number of    Number of    Fraction      Total tubes    Total       p̄ to date
        tubes        tubes        defective     inspected      defects     (in %),
        inspected,   rejected,    in sample     to date,       to date,    Σpn/Σn
Date    n            pn           (in %), p     Σn             Σpn                       p̄q̄

 1      16,484       2,008        12.2           16,484         2,008      12.2          1071
 2      24,708       2,719        11.0           41,192         4,727      11.5          1018
 3      27,599       2,691         9.8           68,791         7,418      10.8           963
 4      28,545       2,699         9.5           97,336        10,117      10.4           932
 5      31,530       3,377        10.7          128,866        13,494      10.5           940
 6       8,588       1,100        12.8          137,454        14,594      10.6           948
 7      19,574       1,478         7.6          157,028        16,072      10.2           916
 8      28,644       2,170         7.8          185,672        18,242       9.8           884
 9      29,256       2,214         7.6          214,928        20,456       9.5           860
10      32,605       2,540         7.8          247,533        22,996       9.3           844
11       9,314         750         8.1          256,847        23,746       9.2           835
12      16,163       1,108         6.9          273,010        24,854       9.1           827
13      25,601       1,945         7.6          298,611        26,799       9.0           819
14      22,170       1,690         7.6          320,781        28,489       8.9           811
15      26,462       2,162         8.2          347,243        30,651       8.8           803
16       7,955         671         8.4          355,198        31,322       8.8           803
17      11,908         790         6.6          367,106        32,112       8.7           794
18      23,162       1,641         7.1          390,268        33,753       8.6           786
19      24,154       1,890         7.8          414,422        35,643       8.6           786
20      25,287       1,911         7.6          439,709        37,554       8.5           778
21       4,955         517        10.4          444,664        38,071       8.6           786
22      20,095       1,525         7.6          464,759        39,596       8.5           778



At the end of the 15th day, we have, for the accumulated population
to that date

$$\bar{p} = 8.8 \text{ per cent}$$

$$\bar{p}\bar{q} = 803$$

Does the mean quality of the output of the 16th day conform to the
standard set from this short period? We have

$$\sigma_p = \sqrt{\frac{\bar{p}\bar{q}}{n}} = \sqrt{\frac{803}{7955}} = 0.32$$

The percentage defective of the 16th day is 8.4, which lies between
$\bar{p} \pm 2\sigma_p$. The advance in quality of 0.4 per cent from the standard $\bar{p}$
is reasonably attributable to chance, and no inquiry is warranted.
This is not true of the remaining days, all of which show significant
departures from the standard.
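The day-by-day check described above can be sketched as follows. This is an illustrative Python fragment, not the author's procedure verbatim; the function name `two_sigma_limits` is an assumption, and the figures are taken from the tube data in the text (all percentages are in per cent, so $\bar{q} = 100 - \bar{p}$).

```python
import math

def two_sigma_limits(p_bar, n):
    """Return (sigma_p, lower, upper), all in per cent, for a standard p_bar (per cent)."""
    q_bar = 100.0 - p_bar
    sigma_p = math.sqrt(p_bar * q_bar / n)
    return sigma_p, p_bar - 2 * sigma_p, p_bar + 2 * sigma_p

# (standard p_bar to date, size n of the following sample, p of the following sample)
rows = [(8.8, 7955, 8.4), (8.8, 11908, 6.6), (8.7, 23162, 7.1)]

for p_bar, n, p in rows:
    sigma_p, lower, upper = two_sigma_limits(p_bar, n)
    in_control = lower <= p <= upper
    print(f"sigma_p={sigma_p:.2f}  limits=({lower:.1f}, {upper:.1f})  p={p}  under control: {in_control}")
```

Because $n$ changes from day to day, the limits must be recomputed for every sample, which is why the text carries $\bar{p}\bar{q}$ rather than $\sigma_p$ in the running table.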
The final steps of the procedure are shown in the following table. 





                           n, number     σ_p =                             p, fraction
        p̄                  of tubes in   √(p̄q̄/n)                           defective in
        (per               following     (per                              following      Under
Date    cent)     p̄q̄       sample        cent)      p̄ - 2σ_p    p̄ + 2σ_p   sample         control

15      8.8       803       7,955        0.32       8.2         9.4         8.4           Yes.
16      8.8       803      11,908        0.26       8.3         9.3         6.6           No.
17      8.7       794      23,162        0.19       8.3         9.1         7.1           No.
18      8.6       786      24,154        0.18       8.2         9.0         7.8           No.
19      8.6       786      25,287        0.18       8.2         9.0         7.6           No.
20      8.5       778       4,955        0.40       7.7         9.3        10.4           No.
21      8.6       786      20,095        0.20       8.2         9.0         7.6           No.
22      8.5       778


As in the previous example, the homogeneity of the current population
of sample fraction defectives must be examined. This is somewhat
laborious for variable $n$. To test the homogeneity of the population
accumulated to the fifteenth day, we have $\bar{p}$ = 8.8 per cent,
$\bar{q}$ = 91.2 per cent and 15 values of $\sigma_p$, depending on $n$.

        σ_p
Date    (in %)    p̄ - 2σ_p    p̄ + 2σ_p

 1      0.22      8.4         9.2
 2      0.18      8.4         9.2
 3      0.17      8.5         9.1
 4      0.17      8.5         9.1
 5      0.16      8.5         9.1
 6      0.31      8.2         9.4
 7      0.20      8.4         9.2
 8      0.17      8.5         9.1
 9      0.17      8.5         9.1
10      0.16      8.5         9.1
11      0.29      8.2         9.4
12      0.22      8.4         9.2
13      0.18      8.4         9.2
14      0.19      8.4         9.2
15      0.17      8.5         9.1

All 15 sample percentage defectives fall outside these limits. There is,
therefore, no evidence of control in the production of these tubes during
this (far too brief) 16-day period, and the judgment previously passed on
the quality of the tubes of the 16th day must be rescinded. Allocable
causes of variability are present and before any effective quality control
program can be set up, as many as possible of these causes must be
discovered and removed.

A similar test of homogeneity must be carried out on each successive 
population. 

NOTES 

4.6 Probability of t defects. Let the probability of a defect be $p$. Let
$q$ be the probability that the piece is good, $p + q = 1$. If a random sample
of $n$ pieces is taken, each selection of a piece being independent of all others,
what is the probability $P_t$ of obtaining exactly $t$ defective pieces and $n - t$
good pieces?

The first $t$ pieces may be defective and the remainder good. This probability
is $p^t q^{n-t}$. But $t$ defective pieces may be obtained in as many ways as one
can form combinations of $n$ pieces taken $t$ at a time, namely, $C^n_t$, where

$$C^n_t = \frac{n!}{t!\,(n - t)!}$$

Hence the answer is

$$P_t = C^n_t\, p^t q^{n-t}$$

Now $C^n_t\, p^t q^{n-t}$ is the general expression for the terms of the expansion of
$(q + p)^n = 1$, i.e.,

$$(q + p)^n = q^n + nq^{n-1}p + \frac{n(n-1)}{2!}q^{n-2}p^2 + \cdots + nqp^{n-1} + p^n$$

Hence the successive terms of the above expression give the probability of
0, 1, 2, ..., $n$ defective pieces. For example, for $p = 1/6$, $q = 5/6$, and $n = 8$,
the probabilities are shown in the table below.



t       $C^8_t$     $p^t q^{8-t}$     $P_t = C^8_t\, p^t q^{8-t}$

0        1      0.2325680     0.233
1        8      0.0465136     0.372
2       28      0.0093027     0.261
3       56      0.0018605     0.104
4       70      0.0003721     0.026
5       56      0.0000744     0.004
6       28      0.0000149     0.000
7        8      0.0000030     0.000
8        1      0.0000006     0.000

Total                         1.000
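The table can be regenerated directly from the formula $P_t = C^n_t\, p^t q^{n-t}$. The sketch below uses Python's exact rational arithmetic; the function name `binomial_probabilities` is an illustrative assumption, not notation from the text.

```python
from fractions import Fraction
from math import comb

def binomial_probabilities(n, p):
    """Return [P_0, P_1, ..., P_n] with P_t = C(n, t) * p**t * q**(n - t)."""
    q = 1 - p
    return [comb(n, t) * p**t * q**(n - t) for t in range(n + 1)]

probs = binomial_probabilities(8, Fraction(1, 6))

for t, p_t in enumerate(probs):
    print(t, comb(8, t), f"{float(p_t):.3f}")

# The terms of (q + p)^n must add to exactly 1.
print("total:", sum(probs))
```

Individual entries may differ from the printed table by one unit in the third decimal place, since the printed column has been rounded so as to total 1.000.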



4.7 Mean and variance of the fraction defective. We shall calculate
the mean and variance of such a distribution of probabilities.

Consider an n-fold experiment (n = 8, above). Repeat it N times. Then,
for the first moment ${}_0M_1$ around the origin (${}_0M_1$ is the arithmetic mean), we
have

$${}_0M_1 = \text{mean number of defects} = \frac{f_0 \times 0 + f_1 \times 1 + \cdots + f_n \times n}{N}$$

where $f_0$ is the frequency of zero defects, $f_1$ the frequency of 1 defect, etc.

$${}_0M_1 = \frac{Nq^n \times 0 + Nnq^{n-1}p \times 1 + N\frac{n(n-1)}{2!}q^{n-2}p^2 \times 2 + \cdots + Np^n \times n}{N}$$

$$= np(q + p)^{n-1} = np$$

To find the variance we first compute the second moment ${}_0M_2$ about zero.
The method is due to Bowley (3).

$${}_0M_2 = \frac{f_0 \times 0^2 + f_1 \times 1^2 + \cdots + f_n \times n^2}{N}$$

$$= n(n - 1)p^2(q + p)^{n-2} + np(q + p)^{n-1} = n^2p^2 + npq$$

But

$${}_0M_2 = \sigma^2 + (\text{mean})^2 \;{}^{*}$$
$$= \sigma^2 + (np)^2$$

$$\therefore\ \sigma^2 = {}_0M_2 - (np)^2 = n^2p^2 + npq - n^2p^2 = npq$$

In a similar way

$$\sqrt{\beta_1} = \frac{q - p}{\sqrt{pqn}}$$

and

$$\beta_2 = 3 + \frac{1 - 6pq}{pqn}$$

Note that as $n$ increases, $\sqrt{\beta_1} \to 0$ and $\beta_2 \to 3$; the distribution of
binomial probabilities approaches normality even for $p \neq q$.

To illustrate some of the foregoing results: if the probability of a defective
piece in a population is $p = 1/6$ and if we draw at random $n = 1000$ pieces, the
mean (expected) number of defects is $pn = 167$ and the standard deviation is
$\sqrt{pqn} = 11.8$. As the distribution of frequencies of defects is approximately
normal for $n = 1000$, we conclude (from Table IV) that in the absence
of all "causes" except that of random sampling variation, about 95 per cent
of such 1000-observation experiments should have a frequency of defects within

$$pn \pm 2\sqrt{pqn}$$

that is

$$167 \pm 23.6 \qquad [1]$$

It is, however, generally more useful to record limits on proportions
(probabilities) than on frequencies. Each probability value is one nth of the
corresponding frequency value; we have

$$\text{mean fraction defective} = \frac{pn}{n} = p$$

$$\text{standard deviation of } p = \sigma_p = \frac{\sqrt{pqn}}{n} = \sqrt{\frac{pq}{n}}$$

Thus in about 95 cases in 100 the proportion of defects in the 1000-observation
experiments should fall between

$$p \pm 2\sigma_p$$

or

$$\tfrac{1}{6} \pm 0.0236$$

which is, of course, equivalent to [1].

* For any variate $x$

$$\frac{\sum x^2}{N} = \frac{\sum (x - \bar{x})^2}{N} + \bar{x}^2$$

the cross-product being zero. In other symbols ${}_0M_2 = \sigma^2 + (\text{mean})^2$.
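The results $np$ and $npq$, and the limits of the illustration, can be checked numerically. The fragment below is a sketch only; `binomial_moments` is an assumed helper name, and the moments are obtained by brute-force summation over the binomial terms rather than by the algebra of the note.

```python
import math
from fractions import Fraction

def binomial_moments(n, p):
    """Mean and variance of the number of defects, summed term by term."""
    q = 1 - p
    probs = [math.comb(n, t) * p**t * q**(n - t) for t in range(n + 1)]
    first = sum(t * p_t for t, p_t in enumerate(probs))       # first moment about zero
    second = sum(t * t * p_t for t, p_t in enumerate(probs))  # second moment about zero
    return first, second - first**2

# Exact check for the n = 8, p = 1/6 example: mean = np, variance = npq.
mean8, var8 = binomial_moments(8, Fraction(1, 6))
print(mean8, var8)

# The n = 1000 illustration, in frequencies and in proportions.
p, n = 1 / 6, 1000
q = 1 - p
mean_count = n * p
sd_count = math.sqrt(n * p * q)
sd_fraction = math.sqrt(p * q / n)
print(round(mean_count), round(sd_count, 1), round(2 * sd_fraction, 4))
```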

4.8 Limits for control. In setting limits of $\pm 2\sigma_{\bar{X}}$ or $\pm 2\sigma_p$, we expect,
even in the ideal case of absence of allocable causes, to find the mean or
fraction defective outside these limits in about 5 per cent of the random
samples. Thus, as we wish to investigate the reasons for every lapse in quality
(outside $\pm 2\sigma_{\bar{X}}$ or $\pm 2\sigma_p$) we shall 5 per cent of the time find no allocable
cause whatever; for example, the sample fraction defective, though outside
$\pm 2\sigma_p$ of $\bar{p}$, actually did occur by chance. If broader limits, say $\pm 3\sigma_p$, are set,
the fraction defective will, in the absence of all but chance forces, fall outside
these limits in only about 3 of each 1000 random samples. Although we shall
then be less frequently searching for "causes" that do not exist, we shall
more frequently not be searching for "causes" that may exist. Thus with
$\pm 3\sigma_p$ a deviation from $\bar{p}$ so large that it could be expected to happen only
once in 100 samples would not be considered to be evidence of lack of control.

This may be stated somewhat differently: if the limits are set at $\pm 3\sigma_p$
and if the true value of the percentage defective $p'$ for current output (from
which the current sample is drawn) is far off the standard $\bar{p}$, say at
$\bar{p} + 3\sigma_p$, then as many as half of all random samples drawn from current
output would indicate control, i.e., their percentage defectives would fall
within $\pm 3\sigma_p$. With the same out-of-control value for the lot and limits
of $\pm 2\sigma_p$, only 16 per cent of the samples will fall within the control limits.

Limits cannot be set with security until one has accumulated experience
as to what limits are economic. It may be suggested that results between $\pm 2\sigma_p$
and $\pm 3\sigma_p$ will bear considerable investigation, whereas results exceeding
$\pm 3\sigma_p$ should always receive extensive investigation.
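The percentages quoted in this note follow from the normal curve. As a rough sketch (assuming, as the note does, that the sample statistic is normally distributed; the function name `prob_within` is illustrative), Python's standard library gives:

```python
from statistics import NormalDist

z = NormalDist()  # standard normal distribution

def prob_within(k, shift=0.0):
    """Chance that a sample falls within +/- k sigma of the standard
    when the true level has moved `shift` sigma away from the standard."""
    return z.cdf(k - shift) - z.cdf(-k - shift)

print(round(1 - prob_within(2), 3))   # chance of a false alarm with 2-sigma limits
print(round(1 - prob_within(3), 4))   # chance of a false alarm with 3-sigma limits
print(round(prob_within(3, 3), 3))    # true level at +3 sigma, 3-sigma limits
print(round(prob_within(2, 3), 3))    # true level at +3 sigma, 2-sigma limits
```

The last two figures correspond to the "half" and the "16 per cent" of the text: narrower limits raise the number of fruitless searches but greatly reduce the chance of overlooking a real shift.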

4.9 Notes on $\sigma_p$ and $p$. Variation in $p$ from sample to sample, as measured
by $\sigma_p$, is presumed to be unallocable, i.e., attributable only to the errors of
sampling. In industrial data, variation in $p$ from sample to sample is composed
not only of such residual errors but of identifiable factors (in our example,
differences in workroom humidity). It would not, however, be appropriate
to include this element of variability in our estimate of $\sigma_p$, for our purpose
is to compare the actual variability in $p$ with the variability that would be
expected under ideal conditions (random sampling effects only).

$p$ actually varies for another reason, for in industrial practice samples are
generally drawn without replacement from a finite population. Assume the
population consists of 100,000 tubes, 5 per cent of which are defective. If the
first draw brings forth a bad tube, the probability of a defect is now no longer
0.05 but

$$\frac{4{,}999}{99{,}999}$$

which is not quite 0.05. This point is relevant if we are interested in determining
the proportion defective in the batch; but in quality control, where
the interest is in spotting absence of control, the population may be considered
to be infinite.



CHAPTER V 
SAMPLING AND THE RISKS OF PRODUCERS AND BUYERS 

5.1 Introduction. A lot of merchandise must often be judged 
acceptable or not on the basis of information provided by a sample drawn 
from the lot. In such cases, the producer and buyer will have to incur 
risks, respectively, of (1) having satisfactory lots rejected and (2) re- 
ceiving poor lots. If numerical values can be placed on these risks, we 
may, under certain assumptions to be stated presently, determine the 
size of sample to be examined and the value of the sample statistic which 
will differentiate acceptable from non-acceptable lots. Or, if the sample 
size and the value of the sample statistic are set by authority, the re- 
spective risks may be determined. 

5.2 Assumptions. In the methods used in this chapter, lots are 
assumed to be infinite in size relative to the samples drawn from them. 
It is also assumed that these infinite populations are approximately 
normal. For example, even though mean quality may decline from $\bar{X}'$
to $\bar{X}''$, the distribution of quality is assumed to remain normal. Finally,
the method of sampling from the lots will always be the random method. 
Further assumptions specific to particular measures of quality will be 
discussed along with those measures. 

Some experience with the methods of this chapter indicates that these 
assumptions, while severe, do not prevent effective practical usage of 
these methods. 

This procedure will now be illustrated by four common measures of 
quality: arithmetic mean, fraction defective, standard deviation, and 
the coefficient of variation, the latter being the standard deviation divided 
by the arithmetic mean. 

For example, assume that the fraction defective is used as a measure
of quality and that the producer's lots average $p$. This figure may be
acceptable to the buyer who wishes, however, to be protected against
the receipt of inferior lots, of quality $p_1$ or greater, where $p_1 > p$, for
product defective to this extent will materially affect his operations.
The desired specification will state that a sample of $n$ specimens should
be drawn from each lot and that lot be marked satisfactory if the sample
shows no more than $c$ defective specimens. Such a specification must
consider two objectives: first, as already mentioned, the buyer's risk of
receiving highly defective merchandise shall be small (say 1 in 100), and
second, the producer's risk of having normally good output (of quality $p$)
rejected shall also be small.

5.3 Producer and buyer risk, using means (Dodge, 9). A lot is to
be judged acceptable or not on the basis of the value of the average
quality $\bar{X}$ of a sample of $n$ pieces drawn at random from that lot. The
producer, who is presumed to be manufacturing the product at a
statistically controlled or near-controlled plant average quality $\bar{X}'$, would like
to run a small risk $P$ of having a lot rejected. The buyer wants to run
a small risk $B$ of obtaining lots whose average quality is as low as $\bar{X}''$
or lower. We want to determine $\bar{X}$ and $n$. The situation is shown
graphically below.



[Figure: the producer's near-normal population (mean $\bar{X}'$, variance $\sigma^2$) and the
buyer's near-normal "tolerance" population (mean $\bar{X}''$, variance $\sigma^2$), each shown
with the distribution of means of samples of size $n$ drawn from it (variance
$\sigma^2_{\bar{X}}$). Horizontal axis: Quality.]



The producer's requirements are given by

$$\frac{\bar{X}' - \bar{X}}{\sigma/\sqrt{n}} = e$$

and the buyer's requirements by

$$\frac{\bar{X} - \bar{X}''}{\sigma/\sqrt{n}} = f$$


In addition to the assumptions stated in 5.2, it is assumed here that
the lot of "tolerance" quality $\bar{X}''$ has the same variance $\sigma^2$ as the
lot of usual mean quality $\bar{X}'$. It may also be noted that if the samples
are drawn carefully at random, strict normality is not necessary for
effective application of the present theorems.

Supplement B of the American Society for Testing Materials'
publication "Manual on Presentation of Data" (1) gives the following data
on an operating characteristic. High values indicate high quality.
Ranges are recorded rather than the standard deviations, for, as already
suggested, ranges are easily calculated and for $n < 15$ a good estimate
of $\sigma$ can be formed from the mean range.





        Number of
Lot     tests made    Average quality    Range

 1      9             37.6                9.5
 2      9             31.4                6.0
 3      9             34.7               13.5
 4      9             35.8               12.0
 5      9             38.5               21.0
 6      9             34.2               17.5
 7      9             36.1               15.5
 8      9             32.3               18.0
 9      9             35.0               12.5
10      9             33.9               14.0



Assume the producer wants to run no more than 2 chances in 100 that
a lot will be rejected and the buyer wants to run no more than 1 chance
in 100 that he will receive a lot with average quality less than 30. How
many pieces $n$ from each lot are to be tested and what is the sample
average $\bar{X}$ which will differentiate acceptable from rejected lots?

We have

$$\bar{X}' = 35$$
$$\bar{X}'' = 30$$
$$P = 0.02$$
$$B = 0.01$$

The "population" of data meets the requirements for control laid down
in the preceding chapter. To compute the standard deviation $\sigma$ from
the ranges: we know that for small samples all of size $n$ and drawn from a
normal population the ratio

$$\frac{\text{Mean range}}{\text{Standard deviation}}$$

is a constant depending only on $n$. From our data and Table VI

$$\text{Mean range} = 14.0$$

$$\text{Ratio for } n = 9:\quad 2.970$$

$$\sigma = \frac{14.0}{2.970} = 4.714$$

From Table IV we find, corresponding to $P = 0.02$ and $B = 0.01$,

$$e = 2.054$$
$$f = 2.326$$

Finally

$$\frac{35 - \bar{X}}{4.714/\sqrt{n}} = 2.054$$

$$\frac{\bar{X} - 30}{4.714/\sqrt{n}} = 2.326$$

from which

$$\bar{X} = 32.65$$
$$n = 17.1$$

i.e., a sample mean of 32.65 and a sample size of about 18.

Inasmuch as the lot is not indefinitely larger than the sample (n = 18) 
the results may be accepted only as rough approximations. 
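The two requirements can be solved in closed form: adding them eliminates $\bar{X}$ and gives $\sqrt{n} = (e + f)\sigma / (\bar{X}' - \bar{X}'')$. The following sketch assumes normality as in 5.2; the function name `mean_sampling_plan` is hypothetical, and the normal deviates are taken from Python's standard library instead of Table IV.

```python
import math
from statistics import NormalDist

def mean_sampling_plan(mean_producer, mean_tolerance, sigma, P, B):
    """Return (n, dividing sample mean) for producer's risk P and buyer's risk B."""
    e = NormalDist().inv_cdf(1 - P)   # normal deviate cutting off a tail of area P
    f = NormalDist().inv_cdf(1 - B)   # normal deviate cutting off a tail of area B
    root_n = (e + f) * sigma / (mean_producer - mean_tolerance)
    x_divide = mean_producer - e * sigma / root_n
    return root_n**2, x_divide

n, x_divide = mean_sampling_plan(35, 30, 4.714, P=0.02, B=0.01)
print(round(n, 1), math.ceil(n), round(x_divide, 2))
```

Rounding $n$ up to the next whole piece gives slightly more protection to both parties than was asked for.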

In place of the foregoing, the buyer may prefer to stipulate that he
wishes to run, say, no more than 1 chance in 100 that he will receive a lot,
a certain percentage of the items of which will be lower in quality than
a stated value. From such a stipulation, we may easily compute $\bar{X}''$
and proceed as above. Thus if in the present example, the buyer wished
to run only 1 chance in 100 of receiving lots with more than 5 per cent
of their contents below a quality of 25, we would have what is shown
graphically below.

[Figure: normal tolerance population of mean $\bar{X}''$ with 5 per cent of its
area lying below a quality of 25.]

From Table IV,

$$\frac{25 - \bar{X}''}{4.714} = -1.645$$

or

$$\bar{X}'' = 32.75$$

This value of $\bar{X}''$ would now be placed in our equations and $\bar{X}$ and $n$
could be found.

5.4 Producer and buyer risk, using fraction defective. As already 
illustrated, quality is sometimes not recorded numerically but simply as 
good or bad. We want to determine the size of sample n to be randomly 
drawn for inspection or test purposes from a lot of size N and also the 
maximum number of defective pieces g the sample may contain for the 
lot still to be acceptable. The producer, as before, is presumed to be 
manufacturing the product at a statistically controlled fraction defective 
p and he wishes to run a small risk P of having his lots rejected. The 
consumer wishes to run a small risk B of receiving lots which have more 
than a proportion of p' defective. 

For these conditions, the producer's and consumer's interests are given,
respectively, by

$$\sum_{r=g+1}^{n} \frac{C^{pN}_{r}\, C^{qN}_{n-r}}{C^{N}_{n}} = P$$

$$\sum_{r=0}^{g} \frac{C^{p'N}_{r}\, C^{q'N}_{n-r}}{C^{N}_{n}} = B$$

where

$$C^{pN}_{r} = \frac{(pN)!}{r!\,(pN - r)!}$$

with similar expressions for the other combinatorial terms. Considering
the lot to be indefinitely large, these become

$$\sum_{r=g+1}^{n} C^{n}_{r}\, p^{r} q^{n-r} = P$$

$$\sum_{r=0}^{g} C^{n}_{r}\, p'^{\,r} q'^{\,n-r} = B$$

where $q = 1 - p$ and $q' = 1 - p'$.

Finally, if $p$ and $p'$ are under 10 per cent, which is common in industrial
practice, and $n$ is large (say over 100) the Poisson form of the above
equations may be safely used.

$$\sum_{r=g+1}^{n} \frac{e^{-pn}(pn)^{r}}{r!} = P$$

$$\sum_{r=0}^{g} \frac{e^{-p'n}(p'n)^{r}}{r!} = B$$
5.5 Example of 5.4. Supplement B of the American Society for
Testing Materials' "Manual on Presentation of Data" (1) gives the
following data on surface defects on galvanized hardware.



Lot       Sample    Number of             Lot       Sample    Number of
number    size      defective pieces      number    size      defective pieces

 1        580        9                    17        640        3
 2        550        7                    18        580        4
 3        580        3                    19        510        6
 4        640        9                    20        580        8
 5        760       11                    21        600        8
 6        760       12                    22        640       12
 7        510        9                    23        640        9
 8        550       10                    24        580        8
 9        640       10                    25        580        8
10        640       10                    26        510        4
11        640        8                    27        640        6
12        640       10                    28        550        8
13        580        7                    29        550        8
14        580        9                    30        430        3
15        550        5                    31        430        6
16        430        5





How many pieces should be taken in each sample, and what is the 
largest number of defective pieces a sample may contain for the lot still 
to be accepted? 

The producer's mean quality is given by 0.013 and the population 
may be shown, by the methods of the preceding chapter, to be under 
statistical control. Assume that the producer wishes to run not more 
than 1 chance in 100 (because of high manufacturing costs) of having 
lots rejected while the consumer is willing to run as many as 5 chances in 
100 (because of relative ease of replacement) of having as much as 5 
per cent of the product defective. We have 

p = 0.013 
p' = 0.05 
P = 0.01 
B = 0.05 



Assume an answer, $n = 200$. We have $pn = 2.6$ and $p'n = 10$, so that

$$\sum_{r=g+1}^{n} \frac{e^{-2.6}(2.6)^{r}}{r!} = 0.01$$

$$\sum_{r=g+1}^{n} \frac{e^{-10}(10)^{r}}{r!} = 1 - 0.05 = 0.95$$

From Figure 1, the first equation is satisfied by $g + 1 = 7.5$, approximately.
But the second requires $g + 1 = 5.5$, so $n = 200$ is not a solution.
By trial and error we come to $n = 300$ and $g + 1 = 9.7$ as an
approximate solution. The sample size should be about 300 and the lot
should not be accepted if the sample contains more than about nine
defective pieces.
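The trial-and-error reading of Figure 1 can be imitated by computing the Poisson sums directly. The sketch below is an illustration under stated assumptions: it restricts $g$ to whole numbers (Figure 1 is read at fractional values), and `find_plan` is a hypothetical helper that steps $n$ in tens; neither is part of the original procedure.

```python
import math

def poisson_cdf(g, m):
    """P(X <= g) for a Poisson variable with mean m."""
    term = math.exp(-m)      # probability of zero defects
    total = term
    for r in range(1, g + 1):
        term *= m / r        # builds e^{-m} m^r / r! recursively
        total += term
    return total

def producer_risk(n, g, p):
    """Chance that a lot of usual quality p is rejected (more than g defects)."""
    return 1.0 - poisson_cdf(g, p * n)

def consumer_risk(n, g, p_tol):
    """Chance that a lot of tolerance quality p_tol is accepted (g or fewer defects)."""
    return poisson_cdf(g, p_tol * n)

def find_plan(p, p_tol, P, B, step=10, n_max=2000):
    """Smallest n (in multiples of `step`) for which a whole-number g meets both risks."""
    for n in range(step, n_max + 1, step):
        g = 0
        while producer_risk(n, g, p) > P:
            g += 1
        if consumer_risk(n, g, p_tol) <= B:
            return n, g
    return None

p, p_tol, P, B = 0.013, 0.05, 0.01, 0.05
for n, g in [(200, 7), (300, 9)]:
    print(n, g, round(producer_risk(n, g, p), 4), round(consumer_risk(n, g, p_tol), 4))
print(find_plan(p, p_tol, P, B))
```

With $n = 300$ and $g = 9$ the producer's risk is under 0.01 while the consumer's risk comes out somewhat above 0.05, which is consistent with the text's description of that plan as an approximate solution; holding both risks strictly with a whole-number $g$ calls for a slightly larger sample.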

Surface defects may be easily noted and at slight inspection expense. 
In such cases, 100 per cent inspection might be feasible. This would not 
be true wherever inspection was costly or where destructive testing 
was necessary. 

NOTES 

5.6 Hypergeometric law. Given a well-mixed lot of size $N$ with $a'$ bad
pieces and $b'$ ($= N - a'$) good pieces. A random sample of size $n$ is drawn from
the lot. What is the probability $P$ that the sample contains $a$ bad pieces and
$b$ good pieces?

A sample of size $n$ can be drawn from a lot of size $N$ in $C^N_n$ ways. Further,
$a$ bad pieces can be drawn from $a'$ bad pieces in $C^{a'}_a$ ways. Similarly, $b$ good
pieces can be drawn in $C^{b'}_b$ ways. Each of the $C^{a'}_a$ sets of $a$ bad pieces can be paired
with each of the $C^{b'}_b$ sets of $b$ good pieces, i.e., the total number of ways in which
$a$ bad pieces and $b$ good pieces can be drawn is $C^{a'}_a C^{b'}_b$. Hence the required
probability is

$$P = \frac{C^{a'}_a\, C^{b'}_b}{C^N_n} = \frac{C^{a'}_a\, C^{b'}_b}{C^{a'+b'}_{a+b}}$$

which is sometimes known as the hypergeometric law.

Thus, if an urn contains two white and two black balls, the probability that
a random sample of two consists of one white and one black ball is

$$P = \frac{C^2_1\, C^2_1}{C^4_2} = \frac{4}{6}$$

Fallacious answers to problems of this type can be avoided if one enumerates
the equally likely cases. Thus our lot is

A B C D
○ ○ ● ●

and the following are equally likely drawings for a sample size of 2

A B     B C
A C     B D
A D     C D

four of which satisfy the requirements of the problem, i.e.,

$$P = \frac{4}{6}$$

The hypergeometric law may be looked upon as the law of compound
probabilities for the case in which the several probabilities are affected by previous
drawings. To illustrate, consider the following problem from Fry (17):

A batch of 1000 lamps is 5 per cent bad. If five are tested, what is the
chance that no bad lamps will appear?

By the hypergeometric law

$$P = \frac{C^{950}_{5}\, C^{50}_{0}}{C^{1000}_{5}} = \frac{950!\; 995!}{945!\; 1000!} = 0.7734$$

The probability that the first lamp is good is 950/1000. If a good lamp is
drawn and not replaced, the probability that the second lamp drawn is good is
949/999. Finally, as all five lamps must be good to satisfy the conditions of
the problem,

$$P = \frac{950}{1000} \cdot \frac{949}{999} \cdot \frac{948}{998} \cdot \frac{947}{997} \cdot \frac{946}{996} = \frac{950!\; 995!}{1000!\; 945!} = 0.7734 \text{ as before}$$
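Both worked examples of this note can be verified by counting combinations exactly. The sketch below is illustrative only; the function name `hypergeometric` and its argument names are assumptions.

```python
from fractions import Fraction
from math import comb

def hypergeometric(bad_in_lot, good_in_lot, bad_drawn, good_drawn):
    """Probability of drawing exactly the stated numbers of bad and good pieces
    when sampling without replacement."""
    lot_size = bad_in_lot + good_in_lot
    sample_size = bad_drawn + good_drawn
    ways = comb(bad_in_lot, bad_drawn) * comb(good_in_lot, good_drawn)
    return Fraction(ways, comb(lot_size, sample_size))

# Urn with two white and two black balls; sample of two, one of each colour.
urn = hypergeometric(2, 2, 1, 1)

# Fry's lamps: 1000 lamps, 50 bad; five tested, none bad.
lamps = hypergeometric(50, 950, 0, 5)

# The same result as a chain of conditional probabilities.
chain = Fraction(1)
for k in range(5):
    chain *= Fraction(950 - k, 1000 - k)

print(urn, round(float(lamps), 4), lamps == chain)
```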



5.7 Binomial approximation. If the lot size $N$ is indefinitely larger than
the sample size $n$, the probability that a lamp is bad will not vary as lamps are
drawn. If $p = 0.05$ is the probability of drawing a bad lamp, the probability
of drawing 0, 1, ..., $n$ good lamps in a sample of $n$ is given by the successive
terms of the binomial

$$(q + p)^n$$

Thus

$$P = (0.95)^5 = 0.7738$$

differing but slightly from the previous answer, as would be expected, for $N$
is 200 times as large as $n$.
5.8 Poisson distribution. Given

$$(q + p)^n$$

write

$$P_r = C^n_r\, q^{n-r} p^r = \frac{n!}{r!\,(n - r)!}\, q^{n-r} p^r$$

$P_r$ being the probability of obtaining exactly $r$ defectives in a random sample
of $n$ pieces drawn from an indefinitely larger lot. If $n$ is large, $p$ small, and
$m$ ($= pn$, the expected number of defects) is a small finite number, the
expression for $P_r$ can be simplified. First write

$$q^{n-r} = q^n \cdot q^{-r}$$

For $r$ considerably less than $n$, the last factor will not differ appreciably from
unity, i.e.,

$$q^{n-r} \approx q^n$$

Replacing $n!$ and $(n - r)!$ by the Stirling approximations

$$n! = \sqrt{2\pi n}\; n^n e^{-n}$$

$$(n - r)! = \sqrt{2\pi(n - r)}\; (n - r)^{n-r} e^{-(n-r)}$$

we obtain

$$P_r = (1 - p)^n\, p^r\, \frac{n^r}{r!} = \left(1 - \frac{m}{n}\right)^{n} \frac{m^r}{r!} \approx \frac{e^{-m} m^r}{r!}$$

the equation of the Poisson distribution.

In our first example $n = 5$, $p = 1/20$, $pn = 1/4$. The conditions for proper
application of the Poisson approximation are not satisfied, for $n$ is not large.
The result indicates, however, a close approximation to the exact answer as
given by the hypergeometric law, for

$$P_0 = \frac{e^{-1/4}\,(1/4)^0}{0!} = 0.7788$$
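The binomial and Poisson answers for the lamp problem, and the way the two draw together when $n$ is large and $p$ small, can be seen numerically. This is a sketch with assumed function names; the second comparison uses $n = 300$, $p = 0.013$ purely as an illustration of a large sample.

```python
import math

def binomial_term(r, n, p):
    """Probability of exactly r defectives in n independent draws."""
    return math.comb(n, r) * p**r * (1 - p)**(n - r)

def poisson_term(r, m):
    """Poisson probability of exactly r defectives when m are expected."""
    return math.exp(-m) * m**r / math.factorial(r)

# Lamp example: five lamps tested, none bad, p = 0.05 (so m = pn = 0.25).
print(round(binomial_term(0, 5, 0.05), 4), round(poisson_term(0, 0.25), 4))

# A large-sample illustration: n = 300, p = 0.013 (m = 3.9), r = 3 defectives.
print(round(binomial_term(3, 300, 0.013), 4), round(poisson_term(3, 3.9), 4))
```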



5.9 Note on Figure 1 (p. 176). In a sample of size $n$, the probabilities
of 0, 1, ..., $n$ defects, as given by the Poisson law, are

$$\frac{e^{-m} m^0}{0!},\ \frac{e^{-m} m^1}{1!},\ \frac{e^{-m} m^2}{2!},\ \ldots,\ \frac{e^{-m} m^n}{n!}$$

The probability that a sample of $n$ contains more than $g$ defects (i.e., at least
$g + 1$ defects) is

$$\sum_{r=g+1}^{n} \frac{e^{-m} m^r}{r!}$$

Figure 1 gives the probability of at least $c$ defects for various values of $m$,
that is,

$$\sum_{r=c}^{n} \frac{e^{-m} m^r}{r!}$$

which is equivalent to

$$1 - \sum_{r=0}^{c-1} \frac{e^{-m} m^r}{r!}$$

5.10 Producer and buyer risk, using the standard deviation. Fre- 
quently, variability in the quality of a product may be even less desirable 
than low average quality. Metal strips all of about the same breaking 
strength and electric lamps all of about the same life may often be pre- 
ferred to batches of these products which are of higher mean quality but 
which contain many very good and many very bad strips or lamps. 

Crum (7) states that studies involving several hundred concrete beams
used in paving projects in Iowa yield a standard deviation $\sigma'$ of about
10 per cent of the mean quality. The latter is given by a modulus of
rupture of 760 pounds per square inch. Assume that for a certain job,
$\sigma''$ = 20% is the buyer's tolerance variability. Producer and buyer
want small risks, say, 1 in 100 of respectively (1) rejected lots, (2)
less-than-tolerance quality lots. How many pieces should be drawn for test
purposes from each lot and what should be the maximum standard
deviation of the sample if that lot is to be accepted?

The distribution of the variances $s^2$ of samples each of size $n$ drawn at
random from a normal population of variance $\sigma^2$ is given by

$$y\, d(s^2) = C\, (s^2)^{(n-3)/2}\, e^{-ns^2/2\sigma^2}\, d(s^2)$$

where $C$ is a constant. The distribution of $s_i$ is immediately derivable
and has been discussed in 1.37. It is most convenient, so far as available
tables are concerned, to use the fact that the function

$$\frac{ns^2}{\sigma^2}$$

is distributed as $\chi^2$ with $n - 1$ degrees of freedom. Values of $\chi^2$ are
shown in Table VII.
In this example, the larger the value of $\sigma$, the poorer the quality.
Hence, compared to the two previous examples, the producer's and the
tolerance populations are reversed in position along the horizontal axis.

[Figure: distribution of $s_i$ from a normal population of standard deviation $\sigma'$,
and distribution of $s_i$ from a normal population of standard deviation $\sigma''$.
Horizontal axis: Standard deviation $s$.]



We have

$$P \text{ (producer's risk)} = 0.01$$
$$B \text{ (buyer's risk)} = 0.01$$
$$\sigma' = 76 \qquad \sigma'' = 152$$
$$\sigma'^2 = 5776 \qquad \sigma''^2 = 23{,}104$$

We want to determine $n$ and $s$. We have for the population of $\sigma' = 76$

$$[1] \qquad \frac{ns^2}{5776} = \chi^2_P \text{ with } n - 1 \text{ degrees of freedom,}$$

and for the population of tolerance quality $\sigma'' = 152$

$$[2] \qquad \frac{ns^2}{23{,}104} = \chi^2_B \text{ with } n - 1 \text{ degrees of freedom}$$

In Table VII we have the probabilities of exceeding a value of $\chi^2$ for
various degrees of freedom. We know neither $\chi^2$ nor the number of
degrees of freedom, but we do know that we want the producer's chance
of exceeding $\chi^2_P$ to be 0.01 and the buyer's chance of exceeding $\chi^2_B$ to be
0.99. Also, from [1] and [2]

$$\frac{\chi^2_P}{\chi^2_B} = 4$$

From Table VII, using columns headed by probabilities of 0.99 and 0.01,
we find for the above ratio of 4 about 24 degrees of freedom. ($\chi^2_P$ =
42.980, $\chi^2_B$ = 10.856.) Hence

$$n = 25$$

Substituting in either [1] or [2] we find

$$s^2 = 9930, \text{ approximately}$$
$$s = 99.7$$

A sample should contain 25 items and have a standard deviation of not
more than 100 pounds per square inch for its lot to be acceptable.
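The search through Table VII can be mimicked numerically. The sketch below is an assumption-laden illustration: it computes $\chi^2$ tail areas from the series for the incomplete gamma function and inverts them by bisection (both helper functions are written here for the purpose and are not from the text), then looks for the first number of degrees of freedom at which the ratio $\chi^2_P / \chi^2_B$ falls to 4.

```python
import math

def chi2_upper_tail(x, df):
    """P(chi-square with df degrees of freedom exceeds x), by series expansion."""
    if x <= 0:
        return 1.0
    a, h = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    k = 0
    while term > 1e-16 * total:
        k += 1
        term *= h / (a + k)
        total += term
    return 1.0 - total * math.exp(a * math.log(h) - h - math.lgamma(a))

def chi2_point(prob, df):
    """Value of chi-square exceeded with probability `prob` (bisection)."""
    lo, hi = 0.0, df + 20.0 * math.sqrt(df) + 20.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_upper_tail(mid, df) > prob:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def degrees_of_freedom(variance_ratio, P, B, max_df=200):
    """First df at which chi2_P / chi2_B no longer exceeds the variance ratio."""
    for df in range(1, max_df + 1):
        chi_p = chi2_point(P, df)          # producer: exceeded with chance P
        chi_b = chi2_point(1.0 - B, df)    # buyer: exceeded with chance 1 - B
        if chi_p / chi_b <= variance_ratio:
            return df, chi_p, chi_b
    return None

var_producer, var_tolerance = 76**2, 152**2
df, chi_p, chi_b = degrees_of_freedom(var_tolerance / var_producer, P=0.01, B=0.01)
n = df + 1
s_squared = chi_p * var_producer / n
print(df, n, round(chi_p, 3), round(chi_b, 3), round(s_squared), round(math.sqrt(s_squared), 1))
```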

5.11 Second example of 5.10. Welch (46) has given examples in
which the size of the sample has, as is sometimes the case, already been
fixed by authority. A manufacturer produces electric light bulbs under
controlled conditions with $\sigma' = 0.8$. Ten bulbs are to be sampled from
each lot. The producer is willing to incur a 5 per cent risk of having lots
rejected whereas the buyer wants to know what protection such a sample
will, under these conditions, give him against obtaining lots as bad as
$\sigma'' = 1.5$.

We have

$$\frac{ns^2}{\sigma'^2} = \frac{10 s^2}{(0.8)^2} = 15.62\, s^2 = \chi^2_P$$

For a producer risk of 0.05 and for nine degrees of freedom we have
$\chi^2_P = 16.919$. Hence

$$s = 1.04$$

Finally

$$\frac{ns^2}{\sigma''^2} = \frac{10(1.04)^2}{(1.5)^2} = 4.81 = \chi^2_B$$

For $\chi^2_B = 4.81$ and again with nine degrees of freedom, we have $P = 0.85$,
which is the chance of exceeding $\chi^2_B$. The buyer's risk is therefore 1.00
- 0.85, or 15 per cent, a rather high risk. For better protection to him
either the sample size $n$ must be increased or the standard deviation $\sigma'$
reduced by more efficient plant control.*

* If the producer uses inspection for control, then it is easily shown that for
$\sigma = 0.8$ and $n = 10$, a value of $s = 1.04$ will likely result in the lot being thrown out;
if so, the buyer's risk is practically zero. This applies in principle to all problems in
this chapter. We presume, however, that the buyer desires protection quite
independent of the producer's intentions.
The following graph illustrates the conditions and conclusions of the
preceding example.

[Figure: distribution of $s_i$ of random samples of size $n = 10$ drawn from a normal
population of $\sigma' = 0.8$, and distribution of $s_i$ of random samples of size $n = 10$
drawn from a normal population of $\sigma'' = 1.5$; horizontal axis from 0.20 to 2.00
with the dividing value 1.04 marked.]

General character of distribution of standard deviations s of samples of size n = 10
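When the sample size is fixed in advance, the calculation runs in one direction only: the producer's risk fixes the dividing value of $s$, and the buyer's risk follows. The sketch below repeats that calculation for the light-bulb figures; the two $\chi^2$ helper functions are written for this illustration (series expansion and bisection) and are assumptions, not part of the text.

```python
import math

def chi2_upper_tail(x, df):
    """P(chi-square with df degrees of freedom exceeds x), by series expansion."""
    if x <= 0:
        return 1.0
    a, h = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    k = 0
    while term > 1e-16 * total:
        k += 1
        term *= h / (a + k)
        total += term
    return 1.0 - total * math.exp(a * math.log(h) - h - math.lgamma(a))

def chi2_point(prob, df):
    """Value of chi-square exceeded with probability `prob` (bisection)."""
    lo, hi = 0.0, df + 20.0 * math.sqrt(df) + 20.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_upper_tail(mid, df) > prob:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

n, sigma_producer, sigma_tolerance, P = 10, 0.8, 1.5, 0.05

chi_p = chi2_point(P, n - 1)                        # producer's 5 per cent point
s_limit = math.sqrt(chi_p * sigma_producer**2 / n)  # dividing sample standard deviation
chi_b = n * s_limit**2 / sigma_tolerance**2         # same s, judged against sigma''
buyer_risk = 1.0 - chi2_upper_tail(chi_b, n - 1)    # chance a tolerance lot passes

print(round(chi_p, 3), round(s_limit, 2), round(chi_b, 2), round(buyer_risk, 2))
```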

5.12 Producer and buyer risk, using the coefficient of variation.

Specification of average quality and variability in quality may be
separately provided by the methods already discussed. It is sometimes
desired to make use of one hybrid statistic which has both features.
One is the coefficient of variation which is given by

$$\frac{\text{Standard deviation}}{\text{Arithmetic mean}}$$

High values of this statistic will result from high variability in quality 
and low mean quality, both of which we take in our examples to be un- 
favorable. Correspondingly, low values of the coefficient of variation 
are considered favorable. 

Wilsdon (49) gives a frequency distribution showing the crushing 
strength, in tons, of 188 tests of a brand of brick. As the original data 
are not shown, the following population parameters are estimated from 
his frequency distribution of 188 observations. 

$$\bar{X}' = 28.1$$
$$\sigma' = 4.1$$

or the coefficient of variation at the works, is $V_P = 0.1459$.

Assume that for a certain purpose a buyer is willing to accept brick of 
lower average strength and higher variability in strength. He is willing 
to run a risk of 5 chances in 100 (B) of receiving lots of coefficient of 
variation $V_B = 0.3$. The producer, whose statistically controlled output 
is presumed to be characterized by $V_P = 0.1459$, wishes to run, say, no 
greater than a 1 in 100 risk (P) of having lots rejected. How many 
bricks should be tested, and what is the sample coefficient of variation v 
which divides acceptable from unacceptable lots? 
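The function 

$$\frac{n v^2}{1 + v^2}\left(1 + \frac{1}{V^2}\right)$$

where v is the sample coefficient of variation and V that of the population, 
is distributed approximately as χ² with n − 1 degrees of freedom. This 
approximation is sufficiently accurate for V not greater than 1/3 and n not less than 6. 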
We have, for producer and buyer interests respectively, 

$$\frac{n v^2}{1 + v^2}\left(1 + \frac{1}{V_P^2}\right) = \chi^2_P$$

$$\frac{n v^2}{1 + v^2}\left(1 + \frac{1}{V_B^2}\right) = \chi^2_B$$

where $V_P = 0.1459$ is the coefficient of variation associated with the 
producer's ordinary output and $V_B = 0.3$ is the coefficient of variation 
associated with the buyer's tolerance output. Correspondingly, $\chi^2_P$ is 
the value of χ² associated with the producer's risk (0.01) and $\chi^2_B$ the 
value of χ² associated with the buyer's risk (0.05). It is required to find v and n. 
The following graphical description may be useful. 




As before, we divide one equality by the other and obtain 

$$\frac{\chi^2_P}{\chi^2_B} = \frac{1 + 1/V_P^2}{1 + 1/V_B^2} \approx 4$$

Entering the χ² tables with probabilities of 0.01 and 0.95, we find the ratio 
4 to be associated with approximately 16 degrees of freedom. Hence 

n = 17 

Substituting this value of n and the appropriate value of χ² into either 
of the two original equations, we find 

v = 0.202 

A sample of n = 17 should be drawn and a sample coefficient of v = 0.202 
should divide acceptable and non-acceptable lots. 
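
The table look-up can be imitated numerically. The sketch below is not the author's procedure but a stand-in for it: it computes χ² points from a series for the incomplete gamma function instead of reading a printed table, and it reads "approximately 16 degrees of freedom" as "the degrees of freedom whose tabular ratio lies nearest the required ratio", which is an assumption. All function names are hypothetical.

```python
import math

def chi2_cdf(x, df):
    """P(chi-square with df degrees of freedom <= x), by a power series
    for the regularized lower incomplete gamma function."""
    if x <= 0:
        return 0.0
    a, h = df / 2.0, x / 2.0
    term = total = 1.0 / a
    k = 0
    while term > 1e-16 * total:
        k += 1
        term *= h / (a + k)
        total += term
    return total * math.exp(-h + a * math.log(h) - math.lgamma(a))

def chi2_ppf(p, df):
    """Value x with chi2_cdf(x, df) = p, found by bisection."""
    lo, hi = 0.0, df + 10.0 * math.sqrt(2.0 * df) + 50.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if chi2_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def factor(V):
    """The multiplier 1 + 1/V^2 applied to n v^2 / (1 + v^2)."""
    return 1.0 + 1.0 / V ** 2

def plan(V_P, V_B, producer_risk, buyer_risk, max_df=30):
    """Return (n, v): sample size and dividing sample coefficient."""
    target = factor(V_P) / factor(V_B)
    # Upper point for the producer, lower point for the buyer.
    def gap(df):
        ratio = chi2_ppf(1.0 - producer_risk, df) / chi2_ppf(buyer_risk, df)
        return abs(ratio - target)
    df = min(range(1, max_df + 1), key=gap)
    n = df + 1
    chi2_P = chi2_ppf(1.0 - producer_risk, df)
    w = chi2_P / (n * factor(V_P))      # w = v^2 / (1 + v^2)
    return n, math.sqrt(w / (1.0 - w))

n, v = plan(V_P=0.1459, V_B=0.3, producer_risk=0.01, buyer_risk=0.05)
print(n, round(v, 3))
```

Here v is taken from the producer's equation. Because the degrees of freedom are only approximately right, the buyer's equation would give a slightly different v; a more cautious reading would move to the next larger sample size.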

5.13 Normal approximation to χ². If instead of $V_B = 0.3$, we 
had a more stringent buyer's tolerance level, say, $V_B = 0.2$, we would 
have found 

[3] $$\frac{\chi^2_P}{\chi^2_B} = 1.83$$

For this ratio no satisfactory number of degrees of freedom can be found 
in the tables of χ², i.e., n − 1 is greater than 30. In such a case, a normal 
distribution solution is possible, for 

$$\sqrt{2\chi^2} - \sqrt{2n - 3}$$

is distributed approximately normally with zero mean and unit variance. 
We have 

[4] $$\sqrt{2\chi^2_P} - \sqrt{2n - 3} = 2.32 \qquad \sqrt{2\chi^2_B} - \sqrt{2n - 3} = -1.65$$

the values 2.32 and 1.65 (associated with producer and buyer risks of 
0.01 and 0.05 respectively) being found from Table IV. From [3] and 
[4] we obtain 

n = 85, approximately 

v = 0.172 

As would be expected, if the buyer is to be protected against quality 
lower than $V_B = 0.2$ (instead of $V_B = 0.3$) a larger sample and a smaller 
sample coefficient of variation will be required. 
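
The elimination that leads to n and v can be sketched as follows. The ratio 1.83 and the normal deviates 2.32 and 1.65 are taken as given from the text; the function name and the choice of the buyer's equation for recovering v are assumptions made for illustration.

```python
import math

def normal_approx_plan(ratio, z_P, z_B, V_B):
    """Solve sqrt(2*chi2_P) - sqrt(2n-3) = z_P and
    sqrt(2*chi2_B) - sqrt(2n-3) = -z_B, given chi2_P / chi2_B = ratio."""
    r = math.sqrt(ratio)
    # With a = sqrt(2n - 3): a + z_P = r * (a - z_B).
    a = (z_P + z_B * r) / (r - 1.0)
    n = round((a * a + 3.0) / 2.0)
    # Recover chi2_B for the rounded n, then v from the buyer's equation.
    chi2_B = (math.sqrt(2 * n - 3) - z_B) ** 2 / 2.0
    w = chi2_B / (n * (1.0 + 1.0 / V_B ** 2))   # w = v^2 / (1 + v^2)
    v = math.sqrt(w / (1.0 - w))
    return n, v

n, v = normal_approx_plan(ratio=1.83, z_P=2.32, z_B=1.65, V_B=0.2)
print(n, round(v, 3))  # 85 0.172
```

Since n is rounded to a whole number, the producer's equation would return a slightly different v than the buyer's.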

5.14 Second Example of 5.12. Examples may be given in which n 
has already been specified by an industrial agreement or by a 
governmental authority. Pearson (31, &) considers a case in which $V_B = 0.200$ 
and n = 12. At what level $V_P$ must the producer control the quality 
of his product, and what shall be the value of v which separates acceptable 
from non-acceptable lots, in order that the buyer shall run a 1 per cent 
chance (B) of obtaining lots whose quality is given by a coefficient of 
variation $V_B = 0.200$, and the producer a 5 per cent chance (P) of 
having lots rejected? We have 

$$\frac{n v^2}{1 + v^2}\left(1 + \frac{1}{V_B^2}\right) = \chi^2_B$$

For 11 degrees of freedom and for B = 0.01 (which is equivalent to 99 
chances in 100 of exceeding $\chi^2_B$) we have from Table VII, 

$$\chi^2_B = 3.053$$

from which 

v = 0.0995 

To calculate the necessary level of control $V_P$, we have 

$$\frac{n v^2}{1 + v^2}\left(1 + \frac{1}{V_P^2}\right) = \chi^2_P$$

n = 12 
v = 0.0995 
$\chi^2_P$ = 19.675 

from which 

$V_P$ = 0.0776 
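
Both steps of this example are direct rearrangements of the same relation, and can be sketched as below. The two χ² values are those quoted in the text from Table VII; the function names are hypothetical.

```python
import math

def solve_v(n, V, chi2):
    """Sample coefficient v with n v^2/(1+v^2) * (1 + 1/V^2) = chi2."""
    w = chi2 / (n * (1.0 + 1.0 / V ** 2))   # w = v^2 / (1 + v^2)
    return math.sqrt(w / (1.0 - w))

def solve_V(n, v, chi2):
    """Population coefficient V satisfying the same relation for a given v."""
    base = n * v * v / (1.0 + v * v)
    return 1.0 / math.sqrt(chi2 / base - 1.0)

n = 12
v = solve_v(n, V=0.200, chi2=3.053)    # buyer's side
V_P = solve_V(n, v, chi2=19.675)       # producer's side
print(round(v, 4), round(V_P, 4))
```

The values obtained this way should agree with those in the text to about one unit in the fourth decimal place; the small difference comes from rounding v before the second step.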



NOTE 
5.15 Distribution of v. McKay (27) is responsible for the proof that 

$$\frac{n v^2}{1 + v^2}\left(1 + \frac{1}{V^2}\right)$$

is distributed approximately as χ² with n − 1 degrees of freedom. The 
approximation is best when the coefficient of variation (V) of the normal 
population is small. Fieller (14) gives the following numerical results which show 
that as n becomes larger, the χ² approximation improves. 

V = 1/3, n = 6 

Chance of sample with smaller v        Chance of sample with larger v 
   v     True value    χ² theory          v     True value    χ² theory 
 0.15      0.062         0.067          0.48      0.053         0.048 
 0.14      0.047         0.051          0.51      0.034         0.030 
 0.13      0.034         0.037          0.54      0.022         0.019 
 0.12      0.024         0.026          0.57      0.013         0.012 
 0.11      0.017         0.018          0.60      0.008         0.007 
 0.10      0.011         0.012          0.63      0.005         0.004 
 0.09      0.007         0.007          0.66      0.003         0.003 
 0.08      0.004         0.004          0.69      0.002         0.002 
 0.07      0.002         0.002          0.72      0.001         0.001 
 0.06      0.001         0.001          0.75      0.001         0.001 
V = 1/3, n = 18 

Chance of sample with smaller v        Chance of sample with larger v 
   v     True value    χ² theory          v     True value    χ² theory 
 0.24      0.084         0.088          0.42      0.060         0.058 
 0.23      0.058         0.061          0.43      0.046         0.044 
 0.22      0.038         0.040          0.44      0.035-        0.033 
 0.21      0.024         0.026          0.45      0.026         0.024 
 0.20      0.014         0.015+         0.46      0.019         0.018 
 0.19      0.008         0.009          0.47      0.014         0.013 
 0.18      0.004         0.005-         0.48      0.010         0.009 
 0.17      0.002         0.002          0.49      0.007         0.007 
 0.16      0.001         0.001          0.50      0.005-        0.005- 
                                        0.51      0.003         0.003 
                                        0.52      0.002         0.002 
                                        0.53      0.002         0.002 
                                        0.54      0.001         0.001 
                                        0.55      0.001         0.001 
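
The "χ² theory" columns can be recomputed from McKay's statistic. The sketch below assumes V = 1/3 for these tables and evaluates the χ² distribution by a series instead of from a printed table; the function names are hypothetical, and only the approximate column is reproduced, not the "true value" column.

```python
import math

def chi2_cdf(x, df):
    """P(chi-square with df degrees of freedom <= x), by a power series
    for the regularized lower incomplete gamma function."""
    if x <= 0:
        return 0.0
    a, h = df / 2.0, x / 2.0
    term = total = 1.0 / a
    k = 0
    while term > 1e-16 * total:
        k += 1
        term *= h / (a + k)
        total += term
    return total * math.exp(-h + a * math.log(h) - math.lgamma(a))

def mckay_statistic(v, n, V):
    """n v^2 / (1 + v^2) * (1 + 1/V^2), referred to chi-square on n - 1 d.f."""
    return n * v * v / (1.0 + v * v) * (1.0 + 1.0 / V ** 2)

def chance_smaller(v, n, V):
    """Approximate chance of a sample coefficient below v."""
    return chi2_cdf(mckay_statistic(v, n, V), n - 1)

def chance_larger(v, n, V):
    """Approximate chance of a sample coefficient above v."""
    return 1.0 - chance_smaller(v, n, V)

print(round(chance_smaller(0.15, 6, 1 / 3), 3))
print(round(chance_larger(0.51, 6, 1 / 3), 3))
```

For n = 6 these two calls should come out close to the tabled entries 0.067 and 0.030; individual entries may differ by a unit in the last place through rounding.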
Several of the following books should be consulted in connection with the material 
of this book. 
AITKEN, A. C., Statistical Mathematics, Edinburgh, Oliver and Boyd, 1939. 

An introduction to mathematical statistics; considerably more elementary than Wilks' 
monograph. 

FISHER, R. A., Statistical Methods for Research Workers, 6th ed., Edinburgh, Oliver 
and Boyd, 1936. 

A book in which the author presents a unified treatment of modern experimental statistics, 
much of which is the result of his own contributions. 
FISHER, R. A., The Design of Experiments, Edinburgh, Oliver and Boyd, 1937. 

The leading discussion of the place of statistical technique in the planning of field and laboratory 
experiments. The author is the originator of many of the proposed methods. 

MAINLAND, DONALD, The Treatment of Clinical and Laboratory Data, Edinburgh, 
Oliver and Boyd, 1938. 

Although the author's examples are from medical and dental research, the exposition of the 
assumptions and background of statistics and the treatment of statistical difficulties encountered 
by research workers are not to be found in other studies. 

PEARSON, E. S., Application of Statistical Methods to Industrial Standardisation and 
Quality Control, London, British Standards Institution, 1935. 

An introduction to quality control for industrial workers who have had practically no previous 
training in statistics. 

RIDER, P. R., An Introduction to Modern Statistical Methods, New York, John Wiley 
and Sons, 1939. 

A textbook on modern statistical methods. 

SHEWHART, W. A., Economic Control of the Quality of Manufactured Product, New 
York, D. Van Nostrand Company, 1931. 

A general treatise on the subject of statistical methods for quality control. 

SIMON, LESLIE E., Engineers' Manual of Statistical Methods, New York, John Wiley 
and Sons, 1941. 

This book is intended for those engaged in systematic quality control. There are numerous 
examples of the author's work in the United States Army. 

SNEDECOR, G. W., Statistical Methods Applied to Experiments in Agriculture and 
Biology, Ames, Iowa, Collegiate Press, 1938. 

A textbook, written with appreciation of beginners' difficulties, with many applications from 
the experience of the author and his associates. 

TIPPETT, L. H. C., The Methods of Statistics, 2d ed., London, Williams and Norgate, 
Ltd., 1937. 

A book for experimentalists, with many practical examples. 

WILKS, S. S., Statistical Inference, 1936-1937, Ann Arbor, Michigan, Edwards 
Brothers. 

For those with adequate training in mathematics who wish to get to the mathematical 
foundations of modern statistical technique. 
TABLE III 
PROBABILITY POINTS OF a 

Size of                       Probability points 
sample     Upper     Upper     Upper     Lower     Lower     Lower 
   n         1%        5%       10%       10%        5%        1%       Mean 

   11      0.9359    0.9073    0.8899    0.7409    0.7153    0.6675    0.81805 
   16      0.9137    0.8884    0.8733    0.7452    0.7236    0.6829    0.81128 
   21      0.9001    0.8768    0.8631    0.7495    0.7304    0.6950    0.80792 
   26      0.8901    0.8686    0.8570    0.7530    0.7360    0.7040    0.80590 
   31      0.8827    0.8625    0.8511    0.7559    0.7404    0.7110    0.80456 
   36      0.8769    0.8578    0.8468    0.7583    0.7440    0.7167    0.80360 
   41      0.8722    0.8540    0.8436    0.7604    0.7470    0.7216    0.80289 
   46      0.8682    0.8508    0.8409    0.7621    0.7496    0.7256    0.80233 
   51      0.8648    0.8481    0.8385    0.7636    0.7518    0.7291    0.80188 
   61      0.8592    0.8434    0.8349    0.7662    0.7554    0.7347    0.80122 
   71      0.8549    0.8403    0.8321    0.7683    0.7583    0.7393    0.80074 
   81      0.8515    0.8376    0.8298    0.7700    0.7607    0.7430    0.80038 
   91      0.8484    0.8353    0.8279    0.7714    0.7626    0.7460    0.80010 
  101      0.8460    0.8344    0.8264    0.7726    0.7644    0.7487    0.79988 
  201      0.8322    0.8229    0.8178    0.7796    0.7738    0.7629    0.79888 
  301      0.8260    0.8183    0.8140    0.7828    0.7781    0.7693    0.79855 
  401      0.8223    0.8155    0.8118    0.7847    0.7807    0.7731    0.79838 
  501      0.8198    0.8136    0.8103    0.7861    0.7825    0.7757    0.79828 
  601      0.8179    0.8123    0.8092    0.7873    0.7838    0.7776    0.79822 
  701      0.8164    0.8112    0.8084    0.7878    0.7848    0.7791    0.79817 
  801      0.8152    0.8103    0.8077    0.7885    0.7857    0.7803    0.79813 
  901      0.8142    0.8096    0.8071    0.7890    0.7864    0.7814    0.79811 
 1001      0.8134    0.8090    0.8066    0.7894    0.7869    0.7822    0.79808 

TABLE IV 
NORMAL DISTRIBUTION AREAS 

 x/σ     0.00     0.01     0.02     0.03     0.04     0.05     0.06     0.07     0.08     0.09 

 0.0    0.0000   0.0040   0.0080   0.0120   0.0159   0.0199   0.0239   0.0279   0.0319   0.0359 
 0.1    0.0398   0.0438   0.0478   0.0517   0.0557   0.0596   0.0636   0.0675   0.0714   0.0753 
 0.2    0.0793   0.0832   0.0871   0.0910   0.0948   0.0987   0.1026   0.1064   0.1103   0.1141 
 0.3    0.1179   0.1217   0.1255   0.1293   0.1331   0.1368   0.1406   0.1443   0.1480   0.1517 
 0.4    0.1554   0.1591   0.1628   0.1664   0.1700   0.1736   0.1772   0.1808   0.1844   0.1879 
 0.5    0.1915   0.1950   0.1985   0.2019   0.2054   0.2088   0.2123   0.2157   0.2190   0.2224 
 0.6    0.2257   0.2291   0.2324   0.2357   0.2389   0.2422   0.2454   0.2486   0.2518   0.2549 
 0.7    0.2580   0.2612   0.2642   0.2673   0.2704   0.2734   0.2764   0.2794   0.2823   0.2852 
 0.8    0.2881   0.2910   0.2939   0.2967   0.2995   0.3023   0.3051   0.3078   0.3106   0.3133 
 0.9    0.3159   0.3186   0.3212   0.3238   0.3264   0.3289   0.3315   0.3340   0.3365   0.3389 
 1.0    0.3413   0.3438   0.3461   0.3485   0.3508   0.3531   0.3554   0.3577   0.3599   0.3621 
 1.1    0.3643   0.3665   0.3686   0.3718   0.3729   0.3749   0.3770   0.3790   0.3810   0.3830 
 1.2    0.3849   0.3869   0.3888   0.3907   0.3925   0.3944   0.3962   0.3980   0.3997   0.4015 
 1.3    0.4032   0.4049   0.4066   0.4083   0.4099   0.4115   0.4131   0.4147   0.4162   0.4177 
 1.4    0.4192   0.4207   0.4222   0.4236   0.4251   0.4265   0.4279   0.4292   0.4306   0.4319 
 1.5    0.4332   0.4345   0.4357   0.4370   0.4382   0.4394   0.4406   0.4418   0.4430   0.4441 
 1.6    0.4452   0.4463   0.4474   0.4485   0.4495   0.4505   0.4515   0.4525   0.4535   0.4545 
 1.7    0.4554   0.4564   0.4573   0.4582   0.4591   0.4599   0.4608   0.4616   0.4625   0.4633 
 1.8    0.4641   0.4649   0.4656   0.4664   0.4671   0.4678   0.4686   0.4693   0.4699   0.4706 
 1.9    0.4713   0.4719   0.4726   0.4732   0.4738   0.4744   0.4750   0.4758   0.4762   0.4767 
 2.0    0.4773   0.4778   0.4783   0.4788   0.4793   0.4798   0.4803   0.4808   0.4812   0.4817 
 2.1    0.4821   0.4826   0.4830   0.4834   0.4838   0.4842   0.4846   0.4850   0.4854   0.4857 
 2.2    0.4861   0.4865   0.4868   0.4871   0.4875   0.4878   0.4881   0.4884   0.4887   0.4890 
 2.3    0.4893   0.4896   0.4898   0.4901   0.4904   0.4906   0.4909   0.4911   0.4913   0.4916 
 2.4    0.4918   0.4920   0.4922   0.4925   0.4927   0.4929   0.4931   0.4932   0.4934   0.4936 
 2.5    0.4938   0.4940   0.4941   0.4943   0.4945   0.4946   0.4948   0.4949   0.4951   0.4952 
 2.6    0.4953   0.4955   0.4956   0.4957   0.4959   0.4960   0.4961   0.4962   0.4963   0.4964 
 2.7    0.4965   0.4966   0.4967   0.4968   0.4969   0.4970   0.4971   0.4972   0.4973   0.4974 
 2.8    0.4974   0.4975   0.4976   0.4977   0.4977   0.4978   0.4979   0.4980   0.4980   0.4981 
 2.9    0.4981   0.4982   0.4983   0.4984   0.4984   0.4984   0.4985   0.4985   0.4986   0.4986 
 3.0    0.49865  0.4987   0.4987   0.4988   0.4988   0.4988   0.4989   0.4989   0.4989   0.4990 
 3.1    0.49903  0.4991   0.4991   0.4991   0.4992   0.4992   0.4992   0.4992   0.4993   0.4993 

 x/σ    Area 
 3.2    0.4993129 
 3.3    0.4995166 
 3.4    0.4996631 
 3.5    0.4997674 
 3.6    0.4998409 
 3.7    0.4998922 
 3.8    0.4999277 
 3.9    0.4999519 
 4.0    0.4999683 
 4.5    0.4999966 
 5.0    0.4999997133 
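
Entries of this kind, the area under the normal curve between the mean and a given multiple of σ, can be reproduced from the error function. The sketch below is illustrative only and the function name is an assumption; it also shows where the deviates 2.32 and 1.65 used in 5.13 sit in the table (areas close to 0.49 and 0.45).

```python
import math

def area_zero_to(z):
    """Area under the standard normal curve between 0 and z."""
    return 0.5 * math.erf(z / math.sqrt(2.0))

for z in (1.00, 1.65, 2.32):
    print(z, round(area_zero_to(z), 4))
```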
TABLE VI 

RATIO OF THE MEAN RANGE TO STANDARD DEVIATION 

The ratio of mean range of samples of size n to σ of the normal population from which 
they are drawn 

   n   Mean range/σ      n   Mean range/σ      n   Mean range/σ      n   Mean range/σ 

                        10     3.07751        20     3.73495        30     4.08552 
   1                    11     3.17287        21     3.77834        31     4.11293 
   2     1.12838        12     3.25846        22     3.81938        32     4.13934 
   3     1.69257        13     3.33598        23     3.85832        33     4.16482 
   4     2.05875        14     3.40676        24     3.89535        34     4.18943 
   5     2.32593        15     3.47183        25     3.93063        35     4.21322 
   6     2.53441        16     3.53198        26     3.96432        36     4.23625 
   7     2.70436        17     3.58788        27     3.99654        37     4.25855 
   8     2.84720        18     3.64006        28     4.02741        38     4.28018 
   9     2.97003        19     3.68896        29     4.05704        39     4.30117 

   n   Mean range/σ      n   Mean range/σ      n   Mean range/σ      n   Mean range/σ 

  40     4.32156        85     4.89789       150     5.29849       400     5.93636 
  45     4.41544        90     4.93940       160     5.34244       450     6.00903 
  50     4.49815        95     4.97841       170     5.38344       500     6.07340 
  55     4.57197       100     5.01519       180     5.42186       600     6.18340 
  60     4.63856       105     5.04997       190     5.45799       700     6.27510 
  65     4.69916       110     5.08295       200     5.49209       800     6.35358 
  70     4.75472       120     5.14417       250     5.63837       900     6.42211 
  75     4.80598       130     5.19996       300     5.75553      1000     6.48287 
  80     4.85355       140     5.25118       350     5.85302 
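
Values of this ratio can be recomputed by numerical integration. The sketch below relies on the standard result that, for a normal population with distribution function F, the expected range of n observations in units of σ is the integral of 1 − F(x)^n − (1 − F(x))^n over all x. The integration limits, step count and function names are choices made here for illustration, not the method by which the table was computed.

```python
import math

def normal_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def mean_range_ratio(n, lo=-10.0, hi=10.0, steps=4000):
    """Expected range of n normal observations divided by sigma,
    by Simpson's rule (steps must be even)."""
    def integrand(x):
        F = normal_cdf(x)
        return 1.0 - F ** n - (1.0 - F) ** n
    h = (hi - lo) / steps
    total = integrand(lo) + integrand(hi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(lo + i * h)
    return total * h / 3.0

for n in (2, 5, 10):
    print(n, round(mean_range_ratio(n), 5))
```

For small n the results should agree with the tabled entries to about the places shown; for n = 2 the exact value is 2/√π.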